Background of the Invention 

Certain products and by-products of naturally-occurring metabolic processes in
cells have utility in a wide array of industries, including the food, feed, cosmetics, and
pharmaceutical industries. These molecules, collectively termed 'fine chemicals',
include organic acids, both proteinogenic and non-proteinogenic amino acids,
nucleotides and nucleosides, lipids and fatty acids, diols, carbohydrates, aromatic
compounds, vitamins and cofactors, and enzymes. Their production is most
conveniently performed through the large-scale culture of bacteria developed to produce
and secrete large quantities of one or more desired molecules. One particularly useful
organism for this purpose is Corynebacterium glutamicum, a gram positive,
nonpathogenic bacterium. Through strain selection, a number of mutant strains have
been developed which produce an array of desirable compounds. However, selection of
strains improved for the production of a particular molecule is a time-consuming and
difficult process.

Summary of the Invention

The invention provides novel bacterial nucleic acid molecules which have a
variety of uses. These uses include the identification of microorganisms which can be
used to produce fine chemicals, the modulation of fine chemical production in C.
glutamicum or related bacteria, the typing or identification of C. glutamicum or related
bacteria, as reference points for mapping the C. glutamicum genome, and as markers for
transformation. These novel nucleic acid molecules encode proteins, referred to herein
as metabolic regulatory (MR) proteins.

C. glutamicum is a gram positive, aerobic bacterium which is commonly used in 
industry for the large-scale production of a variety of fine chemicals, and also for the 
degradation of hydrocarbons (such as in petroleum spills) and for the oxidation of 
terpenoids. The MR nucleic acid molecules of the invention, therefore, can be used to 
identify microorganisms which can be used to produce fine chemicals, e.g., by 
fermentation processes. Modulation of the expression of the MR nucleic acids of the 
invention, or modification of the sequence of the MR nucleic acid molecules of the 
invention, can be used to modulate the production of one or more fine chemicals from a 
microorganism (e.g., to improve the yield or production of one or more fine chemicals
from a Corynebacterium or Brevibacterium species). 

The MR nucleic acids of the invention may also be used to identify an organism 
as being Corynebacterium glutamicum or a close relative thereof, or to identify the 
presence of C. glutamicum or a relative thereof in a mixed population of
microorganisms. The invention provides the nucleic acid sequences of a number of C. 
glutamicum genes; by probing the extracted genomic DNA of a culture of a unique or 
mixed population of microorganisms under stringent conditions with a probe spanning a 
region of a C. glutamicum gene which is unique to this organism, one can ascertain 
whether this organism is present. Although Corynebacterium glutamicum itself is 
nonpathogenic, it is related to species pathogenic in humans, such as Corynebacterium 
diphtheriae (the causative agent of diphtheria); the detection of such organisms is of 
significant clinical relevance. 
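
The probe-based detection described above can be illustrated with a minimal Python sketch. It is not the method of the invention: exact or near-exact string matching is used here only as a crude computational stand-in for hybridization under stringent conditions, and the function names (`probe_hits`, `organism_present`), the `max_mismatches` parameter, and the example sequences are hypothetical.

```python
# Illustrative sketch only. Near-exact matching is an ASSUMED stand-in for
# hybridization under stringent conditions; names and sequences are hypothetical.

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def probe_hits(probe: str, sample: str, max_mismatches: int = 0) -> list[int]:
    """Return 0-based start positions in `sample` where the probe, or its
    reverse complement, matches with at most `max_mismatches` mismatches."""
    probe = probe.upper()
    sample = sample.upper()
    targets = (probe, reverse_complement(probe))
    size = len(probe)
    hits = []
    for start in range(len(sample) - size + 1):
        window = sample[start:start + size]
        for target in targets:
            mismatches = sum(a != b for a, b in zip(window, target))
            if mismatches <= max_mismatches:
                hits.append(start)
                break  # count each position once, whichever strand matched
    return hits


def organism_present(probe: str, contigs: list[str], max_mismatches: int = 0) -> bool:
    """Report whether any sequence from a (possibly mixed) sample contains the probe."""
    return any(probe_hits(probe, contig, max_mismatches) for contig in contigs)
```

In practice the probe would be chosen from a region of a C. glutamicum gene that is unique to this organism, and the tolerated mismatch count would be set to reflect the stringency of the wash conditions; both choices are outside the scope of this sketch.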

The MR nucleic acid molecules of the invention may also serve as reference
points for mapping of the C. glutamicum genome, or of genomes of related organisms.
Similarly, these molecules, or variants or portions thereof, may serve as markers for
genetically engineered Corynebacterium or Brevibacterium species.

The MR proteins encoded by the novel nucleic acid molecules of the invention
are capable of, for example, performing a function involved in the transcriptional,
translational, or posttranslational regulation of proteins important for the normal
metabolic functioning of cells. Given the availability of cloning vectors for use in
Corynebacterium glutamicum, such as those disclosed in Sinskey et al., U.S. Patent No.
4,649,119, and techniques for genetic manipulation of C. glutamicum and the related
Brevibacterium species (e.g., lactofermentum) (Yoshihama et al., J. Bacteriol. 162: 591-
597 (1985); Katsumata et al., J. Bacteriol. 159: 306-311 (1984); and Santamaria et al., J.
Gen. Microbiol. 130: 2237-2246 (1984)), the nucleic acid molecules of the invention
may be utilized in the genetic engineering of this organism to make it a better or more
efficient producer of one or more fine chemicals.

This improved yield, production and/or efficiency of production of a fine
chemical may be due to a direct effect of manipulation of a gene of the invention, or it
may be due to an indirect effect of such manipulation. Specifically, alterations in C.
glutamicum MR proteins which normally regulate the yield, production and/or
efficiency of production of fine chemical metabolic pathways may have a direct
impact on the overall production or rate of production of one or more of these desired
compounds from this organism. Alterations in the proteins involved in these metabolic
pathways may also have an indirect impact on the yield, production and/or efficiency of
production of a desired fine chemical. Regulation of metabolism is necessarily
complex, and the regulatory mechanisms governing different pathways may intersect at
multiple points such that more than one pathway can be rapidly adjusted in accordance
with a particular cellular event. This enables the modification of a regulatory protein for
one pathway to have an impact on the regulation of many other pathways as well, some
of which may be involved in the biosynthesis or degradation of a desired fine chemical.
In this indirect fashion, the modulation of action of an MR protein may have an impact
on the production of a fine chemical produced by a pathway different from one which
that MR protein directly regulates.

The nucleic acid and protein molecules of the invention may be utilized to
directly improve the yield, production, and/or efficiency of production of one or more
desired fine chemicals from Corynebacterium glutamicum. Using recombinant genetic
techniques well known in the art, one or more of the regulatory proteins of the invention
may be manipulated such that its function is modulated. For example, the mutation of an
MR protein involved in the repression of transcription of a gene encoding an enzyme
which is required for the biosynthesis of an amino acid such that it no longer is able to
repress transcription may result in an increase in production of that amino acid.
Similarly, the alteration of activity of an MR protein resulting in increased translation or
activating posttranslational modification of a C. glutamicum protein involved in the
biosynthesis of a desired fine chemical may in turn increase the production of that
chemical. The opposite situation may also be of benefit: by increasing the repression of
transcription or translation, or by posttranslational negative modification of a C.
glutamicum protein involved in the regulation of a degradative pathway for a compound,
one may increase the production of this chemical. In each case, the overall yield or rate
of production of the desired fine chemical may be increased.

It is also possible that such alterations in the protein and nucleotide molecules of
the invention may improve the yield, production, and/or efficiency of production of fine
chemicals through indirect mechanisms. The metabolism of any one compound is
necessarily intertwined with other biosynthetic and degradative pathways within the cell,
and necessary cofactors, intermediates, or substrates in one pathway are likely supplied
or limited by another such pathway. Therefore, by modulating the activity of one or
more of the regulatory proteins of the invention, the production or efficiency of activity
of another fine chemical biosynthetic or degradative pathway may be impacted. Further,
the manipulation of one or more regulatory proteins may increase the overall ability of
the cell to grow and multiply in culture, particularly in large-scale fermentative culture,
where growth conditions may be suboptimal. For example, by mutating an MR protein
of the invention which would normally cause a repression in the biosynthesis of
nucleotides in response to suboptimal extracellular supplies of nutrients (thereby
preventing cell division) such that it is decreased in repressor ability, one may increase
the biosynthesis of nucleotides and perhaps increase cell division. Changes in MR
proteins which result in increased cell growth and division in culture may result in an
increase in yield, production, and/or efficiency of production of one or more desired fine
chemicals from the culture, due at least to the increased number of cells producing the
chemical in the culture.

The invention provides novel nucleic acid molecules which encode proteins,
referred to herein as metabolic regulatory (MR) proteins, which are capable of, for
example, performing an enzymatic step involved in the transcriptional, translational, or
posttranslational regulation of metabolic pathways in C. glutamicum. Nucleic acid
molecules encoding an MR protein are referred to herein as MR nucleic acid molecules.
In a preferred embodiment, the MR protein participates in the transcriptional,
translational, or posttranslational regulation of one or more metabolic pathways.
Examples of such proteins include those encoded by the genes set forth in Table 1.

Accordingly, one aspect of the invention pertains to isolated nucleic acid
molecules (e.g., cDNAs, DNAs, or RNAs) comprising a nucleotide sequence encoding
an MR protein or biologically active portions thereof, as well as nucleic acid fragments
suitable as primers or hybridization probes for the detection or amplification of
MR-encoding nucleic acid (e.g., DNA or mRNA). In particularly preferred embodiments,
the isolated nucleic acid molecule comprises one of the nucleotide sequences set forth in
Appendix A or the coding region or a complement thereof of one of these nucleotide
sequences. In other particularly preferred embodiments, the isolated nucleic acid
molecule of the invention comprises a nucleotide sequence which hybridizes to or is at
least about 50%, preferably at least about 60%, more preferably at least about 70%, 80%
or 90%, and even more preferably at least about 95%, 96%, 97%, 98%, 99% or more
homologous to a nucleotide sequence set forth in Appendix A, or a portion thereof. In
other preferred embodiments, the isolated nucleic acid molecule encodes one of the
amino acid sequences set forth in Appendix B. The preferred MR proteins of the
present invention also preferably possess at least one of the MR activities described
herein.
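
As a rough illustration of how a candidate sequence might be compared against the tiered percentages recited above, the following Python sketch computes percent identity over two sequences that have already been aligned. Several points are assumptions rather than anything stated in this document: "homologous" is approximated here by simple percent identity, the alignment itself is taken as given, the recited "at least about" language is simplified to hard cutoffs, and the names `percent_identity`, `homology_tier`, and `TIERS` are hypothetical.

```python
# Illustrative sketch only. Percent identity over a pre-computed alignment is an
# ASSUMED proxy for "percent homologous"; the hard cutoffs simplify "at least about".

TIERS = (95, 90, 80, 70, 60, 50)  # a subset of the percentages recited in the text


def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity of two equal-length aligned sequences ('-' marks a gap).

    Columns where both sequences have a gap are ignored; a gap opposite a
    residue counts as a mismatch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    matches = 0
    compared = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == "-" and y == "-":
            continue
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable alignment columns")
    return 100.0 * matches / compared


def homology_tier(pct: float):
    """Return the highest tier threshold met by `pct`, or None if below 50%."""
    for threshold in TIERS:
        if pct >= threshold:
            return threshold
    return None
```

For example, two aligned 10-nucleotide sequences differing at one position score 90% and fall in the 90% tier. A real comparison would first align the sequences with an established alignment tool and would state which scoring convention is meant.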

In another embodiment, the isolated nucleic acid molecule encodes a protein or
portion thereof wherein the protein or portion thereof includes an amino acid sequence
which is sufficiently homologous to an amino acid sequence of Appendix B, e.g.,
sufficiently homologous to an amino acid sequence of Appendix B such that the protein
or portion thereof maintains an MR activity. Preferably, the protein or portion thereof
encoded by the nucleic acid molecule maintains the ability to transcriptionally,
translationally, or posttranslationally regulate a metabolic pathway in C. glutamicum.
In one embodiment, the protein encoded by the nucleic acid molecule is at least about
50%, preferably at least about 60%, and more preferably at least about 70%, 80%, or
90% and most preferably at least about 95%, 96%, 97%, 98%, or 99% or more
homologous to an amino acid sequence of Appendix B (e.g., an entire amino acid
sequence selected from those sequences set forth in Appendix B). In another preferred
embodiment, the protein is a full length C. glutamicum protein which is substantially
homologous to an entire amino acid sequence of Appendix B (encoded by an open
reading frame shown in Appendix A).

In another preferred embodiment, the isolated nucleic acid molecule is derived
from C. glutamicum and encodes a protein (e.g., an MR fusion protein) which includes a
biologically active domain which is at least about 50% or more homologous to one of
the amino acid sequences of Appendix B and is able to transcriptionally, translationally,
or posttranslationally regulate a metabolic pathway in C. glutamicum, or has one or
more of the activities set forth in Table 1, and which also includes heterologous nucleic
acid sequences encoding a heterologous polypeptide or regulatory regions.

In another embodiment, the isolated nucleic acid molecule is at least 15
nucleotides in length and hybridizes under stringent conditions to a nucleic acid
molecule comprising a nucleotide sequence of Appendix A. Preferably, the isolated
nucleic acid molecule corresponds to a naturally-occurring nucleic acid molecule. More
preferably, the isolated nucleic acid encodes a naturally-occurring C. glutamicum MR
protein, or a biologically active portion thereof.

Another aspect of the invention pertains to vectors, e.g., recombinant expression
vectors, containing the nucleic acid molecules of the invention, and host cells into which
such vectors have been introduced. In one embodiment, such a host cell is used to
produce an MR protein by culturing the host cell in a suitable medium. The MR protein
can then be isolated from the medium or the host cell.

Yet another aspect of the invention pertains to a genetically altered
microorganism in which an MR gene has been introduced or altered. In one
embodiment, the genome of the microorganism has been altered by introduction of a
nucleic acid molecule of the invention encoding wild-type or mutated MR sequence as a
transgene. In another embodiment, an endogenous MR gene within the genome of the
microorganism has been altered, e.g., functionally disrupted, by homologous
recombination with an altered MR gene. In another embodiment, an endogenous or
introduced MR gene in a microorganism has been altered by one or more point
mutations, deletions, or inversions, but still encodes a functional MR protein. In still
another embodiment, one or more of the regulatory regions (e.g., a promoter, repressor,
or inducer) of an MR gene in a microorganism has been altered (e.g., by deletion,
truncation, inversion, or point mutation) such that the expression of the MR gene is
modulated. In a preferred embodiment, the microorganism belongs to the genus
Corynebacterium or Brevibacterium, with Corynebacterium glutamicum being
particularly preferred. In a preferred embodiment, the microorganism is also utilized for
the production of a desired compound, such as an amino acid, with lysine being
particularly preferred.

In another aspect, the invention provides a method of identifying the presence or
activity of Corynebacterium diphtheriae in a subject. This method includes detection of
one or more of the nucleic acid or amino acid sequences of the invention (e.g., the
sequences set forth in Appendix A or Appendix B) in a subject, thereby detecting the
presence or activity of Corynebacterium diphtheriae in the subject.

Still another aspect of the invention pertains to an isolated MR protein or a portion, e.g.,
a biologically active portion, thereof. In a preferred embodiment, the isolated MR
protein or portion thereof transcriptionally, translationally, or posttranslationally
regulates one or more metabolic pathways in C. glutamicum. In another preferred
embodiment, the isolated MR protein or portion thereof is sufficiently homologous to an
amino acid sequence of Appendix B such that the protein or portion thereof maintains
the ability to transcriptionally, translationally, or posttranslationally regulate one or
more metabolic pathways in C. glutamicum.

The invention also provides an isolated preparation of an MR protein. In
preferred embodiments, the MR protein comprises an amino acid sequence of Appendix
B. In another preferred embodiment, the invention pertains to an isolated full length
protein which is substantially homologous to an entire amino acid sequence of Appendix
B (encoded by an open reading frame set forth in Appendix A). In yet another
embodiment, the protein is at least about 50%, preferably at least about 60%, and more
preferably at least about 70%, 80%, or 90%, and most preferably at least about 95%,
96%, 97%, 98%, or 99% or more homologous to an entire amino acid sequence of
Appendix B. In other embodiments, the isolated MR protein comprises an amino acid
sequence which is at least about 50% or more homologous to one of the amino acid
sequences of Appendix B and is able to transcriptionally, translationally, or
posttranslationally regulate one or more metabolic pathways in C. glutamicum, or has
one or more of the activities set forth in Table 1.

Alternatively, the isolated MR protein can comprise an amino acid sequence
which is encoded by a nucleotide sequence which hybridizes, e.g., hybridizes under
stringent conditions, or is at least about 50%, preferably at least about 60%, more
preferably at least about 70%, 80%, or 90%, and even more preferably at least about
95%, 96%, 97%, 98%, or 99% or more homologous, to a nucleotide sequence of
Appendix B. It is also preferred that the preferred forms of MR proteins also have one
or more of the MR bioactivities described herein.

The MR polypeptide, or a biologically active portion thereof, can be operatively 
linked to a non-MR polypeptide to form a fusion protein. In preferred embodiments, 
this fusion protein has an activity which differs from that of the MR protein alone. In 
other preferred embodiments, this fusion protein transcriptionally, translationally, or 
posttranslationally regulates one or more metabolic pathways in C. glutamicum. In 
particularly preferred embodiments, integration of this fusion protein into a host cell 
modulates production of a desired compound from the cell. 

In another aspect, the invention provides methods for screening molecules which
modulate the activity of an MR protein, either by interacting with the protein itself or a
substrate or binding partner of the MR protein, or by modulating the transcription or
translation of an MR nucleic acid molecule of the invention. Another aspect of the
invention pertains to a method for producing a fine chemical. This method involves the
culturing of a cell containing a vector directing the expression of an MR nucleic acid
molecule of the invention, such that a fine chemical is produced. In a preferred
embodiment, this method further includes the step of obtaining a cell containing such a
vector, in which a cell is transfected with a vector directing the expression of an MR
nucleic acid. In another preferred embodiment, this method further includes the step of
recovering the fine chemical from the culture. In a particularly preferred embodiment,
the cell is from the genus Corynebacterium or Brevibacterium, or is selected from those
strains set forth in Table 3.

Another aspect of the invention pertains to methods for modulating production of
a molecule from a microorganism. Such methods include contacting the cell with an
agent which modulates MR protein activity or MR nucleic acid expression such that a
cell associated activity is altered relative to this same activity in the absence of the
agent. In a preferred embodiment, the cell is modulated for one or more C. glutamicum
metabolic pathway regulatory systems, such that the yield or rate of production of a
desired fine chemical by this microorganism is improved. The agent which modulates
MR protein activity can be an agent which stimulates MR protein activity or MR nucleic
acid expression. Examples of agents which stimulate MR protein activity or MR nucleic
acid expression include small molecules, active MR proteins, and nucleic acids encoding
MR proteins that have been introduced into the cell. Examples of agents which inhibit
MR activity or expression include small molecules and antisense MR nucleic acid
molecules.

Another aspect of the invention pertains to methods for modulating yields of a
desired compound from a cell, involving the introduction of a wild-type or mutant MR
gene into a cell, either maintained on a separate plasmid or integrated into the genome of
the host cell. If integrated into the genome, such integration can be random, or it can
take place by homologous recombination such that the native gene is replaced by the
introduced copy, causing the production of the desired compound from the cell to be
modulated. In a preferred embodiment, said yields are increased. In another preferred
embodiment, said chemical is a fine chemical. In a particularly preferred embodiment,
said fine chemical is an amino acid. In especially preferred embodiments, said amino
acid is L-lysine.

Detailed Description of the Invention 

The present invention provides MR nucleic acid and protein molecules which are
involved in the regulation of metabolism in Corynebacterium glutamicum, including
regulation of fine chemical metabolism. The molecules of the invention may be utilized
in the modulation of production of fine chemicals from microorganisms, such as C.
glutamicum, either directly (e.g., where modulation of the activity of a lysine
biosynthesis regulatory protein has a direct impact on the yield, production, and/or
efficiency of production of lysine from that organism), or may have an indirect impact
which nonetheless results in an increase in yield, production, and/or efficiency of
production of the desired compound (e.g., where modulation of the regulation of a
nucleotide biosynthesis protein has an impact on the production of an organic acid or a
fatty acid from the bacterium, perhaps due to concomitant regulatory alterations in the
biosynthetic or degradation pathways for these chemicals in response to the altered
regulation of nucleotide biosynthesis). Aspects of the invention are further explicated
below.

Fine Chemicals 

The term 'fine chemical' is art-recognized and includes molecules produced by
an organism which have applications in various industries, such as, but not limited to,
the pharmaceutical, agriculture, and cosmetics industries. Such compounds include
organic acids, such as tartaric acid, itaconic acid, and diaminopimelic acid, both
proteinogenic and non-proteinogenic amino acids, purine and pyrimidine bases,
nucleosides, and nucleotides (as described e.g. in Kuninaka, A. (1996) Nucleotides and
related compounds, p. 561-612, in Biotechnology vol. 6, Rehm et al., eds. VCH:
Weinheim, and references contained therein), lipids, both saturated and unsaturated fatty
acids (e.g., arachidonic acid), diols (e.g., propane diol, and butane diol), carbohydrates
(e.g., hyaluronic acid and trehalose), aromatic compounds (e.g., aromatic amines,
vanillin, and indigo), vitamins and cofactors (as described in Ullmann's Encyclopedia of
Industrial Chemistry, vol. A27, "Vitamins", p. 443-613 (1996) VCH: Weinheim and
references therein; and Ong, A.S., Niki, E. & Packer, L. (1995) "Nutrition, Lipids,
Health, and Disease" Proceedings of the UNESCO/Confederation of Scientific and
Technological Associations in Malaysia, and the Society for Free Radical Research -
Asia, held Sept. 1-3, 1994 at Penang, Malaysia, AOCS Press, (1995)), enzymes,
polyketides (Cane et al. (1998) Science 282: 63-68), and all other chemicals described in
Gutcho (1983) Chemicals by Fermentation, Noyes Data Corporation, ISBN:
0818805086 and references therein. The metabolism and uses of certain of these fine
chemicals are further explicated below.

A. Amino Acid Metabolism and Uses 

Amino acids comprise the basic structural units of all proteins, and as such are
essential for normal cellular functioning in all organisms. The term "amino acid" is
art-recognized. The proteinogenic amino acids, of which there are 20 species, serve as
structural units for proteins, in which they are linked by peptide bonds, while the
nonproteinogenic amino acids (hundreds of which are known) are not normally found in
proteins (see Ullmann's Encyclopedia of Industrial Chemistry, vol. A2, p. 57-97 VCH:
Weinheim (1985)). Amino acids may be in the D- or L- optical configuration, though
L-amino acids are generally the only type found in naturally-occurring proteins.
Biosynthetic and degradative pathways of each of the 20 proteinogenic amino acids
have been well characterized in both prokaryotic and eukaryotic cells (see, for example,
Stryer, L. Biochemistry, 3rd edition, pages 578-590 (1988)). The 'essential' amino acids
(histidine, isoleucine, leucine, lysine, methionine, phenylalanine, threonine, tryptophan,
and valine), so named because they are generally a nutritional requirement due to the
complexity of their biosyntheses, are readily converted by simple biosynthetic pathways
to the remaining 11 'nonessential' amino acids (alanine, arginine, asparagine, aspartate,
cysteine, glutamate, glutamine, glycine, proline, serine, and tyrosine). Higher animals
do retain the ability to synthesize some of these amino acids, but the essential amino
acids must be supplied from the diet in order for normal protein synthesis to occur.

Aside from their function in protein biosynthesis, these amino acids are
interesting chemicals in their own right, and many have been found to have various
applications in the food, feed, chemical, cosmetics, agriculture, and pharmaceutical
industries. Lysine is an important amino acid in the nutrition not only of humans, but
also of monogastric animals such as poultry and swine. Glutamate is most commonly
used as a flavor additive (monosodium glutamate, MSG) and is widely used throughout
the food industry, as are aspartate, phenylalanine, glycine, and cysteine. Glycine,
L-methionine and tryptophan are all utilized in the pharmaceutical industry. Glutamine,
valine, leucine, isoleucine, histidine, arginine, proline, serine and alanine are of use in
both the pharmaceutical and cosmetics industries. Threonine, tryptophan, and
D/L-methionine are common feed additives (Leuchtenberger, W. (1996) Amino acids -
technical production and use, p. 466-502 in Rehm et al. (eds.) Biotechnology vol. 6,
chapter 14a, VCH: Weinheim). Additionally, these amino acids have been found to be
useful as precursors for the synthesis of synthetic amino acids and proteins, such as
N-acetylcysteine, S-carboxymethyl-L-cysteine, (S)-5-hydroxytryptophan, and others
described in Ullmann's Encyclopedia of Industrial Chemistry, vol. A2, p. 57-97, VCH:
Weinheim, 1985.

The biosynthesis of these natural amino acids in organisms capable of
producing them, such as bacteria, has been well characterized (for review of bacterial
amino acid biosynthesis and regulation thereof, see Umbarger, H.E. (1978) Ann. Rev.
Biochem. 47: 533-606). Glutamate is synthesized by the reductive amination of
α-ketoglutarate, an intermediate in the citric acid cycle. Glutamine, proline, and arginine
are each subsequently produced from glutamate. The biosynthesis of serine is a
three-step process beginning with 3-phosphoglycerate (an intermediate in glycolysis), and
resulting in this amino acid after oxidation, transamination, and hydrolysis steps. Both
cysteine and glycine are produced from serine; the former by the condensation of
homocysteine with serine, and the latter by the transferal of the side-chain β-carbon
atom to tetrahydrofolate, in a reaction catalyzed by serine transhydroxymethylase.
Phenylalanine and tyrosine are synthesized from the glycolytic and pentose phosphate
pathway precursors erythrose 4-phosphate and phosphoenolpyruvate in a 9-step
biosynthetic pathway that differs only at the final two steps after synthesis of prephenate.
Tryptophan is also produced from these two initial molecules, but its synthesis is an
11-step pathway. Tyrosine may also be synthesized from phenylalanine, in a reaction
catalyzed by phenylalanine hydroxylase. Alanine, valine, and leucine are all
biosynthetic products of pyruvate, the final product of glycolysis. Aspartate is formed
from oxaloacetate, an intermediate of the citric acid cycle. Asparagine, methionine,
threonine, and lysine are each produced by the conversion of aspartate. Isoleucine is
formed from threonine. A complex 9-step pathway results in the production of histidine
from 5-phosphoribosyl-1-pyrophosphate, an activated sugar.

Amino acids in excess of the protein synthesis needs of the cell cannot be stored,
and are instead degraded to provide intermediates for the major metabolic pathways of
the cell (for review see Stryer, L. Biochemistry 3rd ed. Ch. 21 "Amino Acid Degradation
and the Urea Cycle" p. 495-516 (1988)). Although the cell is able to convert unwanted
amino acids into useful metabolic intermediates, amino acid production is costly in
terms of energy, precursor molecules, and the enzymes necessary to synthesize them.
Thus it is not surprising that amino acid biosynthesis is regulated by feedback inhibition,
in which the presence of a particular amino acid serves to slow or entirely stop its own
production (for overview of feedback mechanisms in amino acid biosynthetic pathways,
see Stryer, L. Biochemistry, 3rd ed. Ch. 24: "Biosynthesis of Amino Acids and Heme" p.
575-600 (1988)). Thus, the output of any particular amino acid is limited by the amount
of that amino acid present in the cell.
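
The limiting effect of feedback inhibition described above can be made concrete with a toy numerical model. The following Python sketch is an assumption-laden illustration, not a model taken from this document: it supposes a single amino acid pool whose synthesis rate falls as vmax / (1 + a / ki) as the pool level a rises, and which is consumed at a rate proportional to a. The function name `simulate_pool` and all parameter names and values are hypothetical.

```python
# Toy model only. The rate law and every parameter value are ASSUMED for
# illustration; they are not taken from the specification.

def simulate_pool(vmax: float, ki: float, k_use: float,
                  dt: float = 0.01, steps: int = 5000, a0: float = 0.0) -> float:
    """Integrate da/dt = vmax / (1 + a / ki) - k_use * a with forward Euler.

    vmax  : maximal synthesis rate when no product is present
    ki    : inhibition constant (larger ki means weaker feedback inhibition)
    k_use : first-order rate of consumption or degradation of the amino acid
    Returns the pool level after steps * dt time units.
    """
    a = a0
    for _ in range(steps):
        synthesis = vmax / (1.0 + a / ki)  # synthesis slows as product accumulates
        a += dt * (synthesis - k_use * a)
    return a


if __name__ == "__main__":
    tight = simulate_pool(vmax=10.0, ki=1.0, k_use=1.0)      # strong feedback
    relaxed = simulate_pool(vmax=10.0, ki=100.0, k_use=1.0)  # weakened feedback
    print(f"pool with strong feedback:   {tight:.2f}")
    print(f"pool with weakened feedback: {relaxed:.2f}")
```

Under these assumed parameters the pool settles at a markedly higher level when the inhibition constant is raised, which mirrors, in a highly simplified way, why relieving a regulatory constraint may increase the output of a desired amino acid.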

B. Vitamin, Cofactor, and Nutraceutical Metabolism and Uses

Vitamins, cofactors, and nutraceuticals comprise another group of molecules
which the higher animals have lost the ability to synthesize and so must ingest, although
they are readily synthesized by other organisms such as bacteria. These molecules are
either bioactive substances themselves, or are precursors of biologically active
substances which may serve as electron carriers or intermediates in a variety of
metabolic pathways. Aside from their nutritive value, these compounds also have
significant industrial value as coloring agents, antioxidants, and catalysts or other
processing aids. (For an overview of the structure, activity, and industrial applications
of these compounds, see, for example, Ullmann's Encyclopedia of Industrial Chemistry,
"Vitamins" vol. A27, p. 443-613, VCH: Weinheim, 1996.) The term "vitamin" is
art-recognized, and includes nutrients which are required by an organism for normal
functioning, but which that organism cannot synthesize by itself. The group of vitamins
may encompass cofactors and nutraceutical compounds. The language "cofactor"
includes nonproteinaceous compounds required for a normal enzymatic activity to
occur. Such compounds may be organic or inorganic; the cofactor molecules of the
invention are preferably organic. The term "nutraceutical" includes dietary supplements
having health benefits in plants and animals, particularly humans. Examples of such
molecules are vitamins, antioxidants, and also certain lipids (e.g., polyunsaturated fatty
acids).

The biosynthesis of these molecules in organisms capable of producing them, 
such as bacteria, has been largely characterized (Ullmann's Encyclopedia of Industrial 
Chemistry, "Vitamins" vol. A27, p. 443-613, VCH: Weinheim, 1996; Michal, G. (1999) 
Biochemical Pathways: An Atlas of Biochemistry and Molecular Biology, John Wiley 
& Sons; Ong, A.S., Niki, E. & Packer, L. (1995) "Nutrition, Lipids, Health, and 
Disease" Proceedings of the UNESCO/Confederation of Scientific and Technological 
Associations in Malaysia, and the Society for Free Radical Research - Asia, held Sept. 
1-3, 1994 at Penang, Malaysia, AOCS Press: Champaign, IL X, 374 S). 

Thiamin (vitamin B1) is produced by the chemical coupling of pyrimidine and 
thiazole moieties. Riboflavin (vitamin B2) is synthesized from guanosine-5'-triphosphate 
(GTP) and ribose-5'-phosphate. Riboflavin, in turn, is utilized for the synthesis of flavin 
mononucleotide (FMN) and flavin adenine dinucleotide (FAD). The family of 
compounds collectively termed 'vitamin B6' (e.g., pyridoxine, pyridoxamine, 
pyridoxal-5'-phosphate, and the commercially used pyridoxine hydrochloride) are all derivatives of 
the common structural unit, 5-hydroxy-6-methylpyridine. Pantothenate (pantothenic 
acid, (R)-(+)-N-(2,4-dihydroxy-3,3-dimethyl-1-oxobutyl)-β-alanine) can be produced 
either by chemical synthesis or by fermentation. The final steps in pantothenate 
biosynthesis consist of the ATP-driven condensation of β-alanine and pantoic acid. The 
enzymes responsible for the biosynthesis steps for the conversion to pantoic acid, to 
β-alanine and for the condensation to pantothenic acid are known. The metabolically 
active form of pantothenate is Coenzyme A, for which the biosynthesis proceeds in 5 
enzymatic steps. Pantothenate, pyridoxal-5'-phosphate, cysteine and ATP are the 
precursors of Coenzyme A. These enzymes not only catalyze the formation of 
pantothenate, but also the production of (R)-pantoic acid, (R)-pantolactone, 
(R)-panthenol (provitamin B5), pantetheine (and its derivatives) and coenzyme A. 

Biotin biosynthesis from the precursor molecule pimeloyl-CoA in 
microorganisms has been studied in detail and several of the genes involved have been 
identified. Many of the corresponding proteins have been found to also be involved in 
Fe-cluster synthesis and are members of the nifS class of proteins. Lipoic acid is 
derived from octanoic acid, and serves as a coenzyme in energy metabolism, where it 
becomes part of the pyruvate dehydrogenase complex and the α-ketoglutarate 
dehydrogenase complex. The folates are a group of substances which are all derivatives 
of folic acid, which in turn is derived from L-glutamic acid, p-amino-benzoic acid and 
6-methylpterin. The biosynthesis of folic acid and its derivatives, starting from the 
metabolism intermediates guanosine-5'-triphosphate (GTP), L-glutamic acid and 
p-amino-benzoic acid has been studied in detail in certain microorganisms. 

Corrinoids (such as the cobalamins and particularly vitamin B12) and 
porphyrins belong to a group of chemicals characterized by a tetrapyrrole ring system. 
The biosynthesis of vitamin B12 is sufficiently complex that it has not yet been 
completely characterized, but many of the enzymes and substrates involved are now 
known. Nicotinic acid (nicotinate) and nicotinamide are pyridine derivatives which are 
also termed 'niacin'. Niacin is the precursor of the important coenzymes NAD 
(nicotinamide adenine dinucleotide) and NADP (nicotinamide adenine dinucleotide 
phosphate) and their reduced forms. 

The large-scale production of these compounds has largely relied on cell-free 
chemical syntheses, though some of these chemicals, such as riboflavin, vitamin B6, 
pantothenate, and biotin, have also been produced by large-scale culture of 
microorganisms. Only vitamin B12 is produced solely by fermentation, due to the complexity of 
its synthesis. In vitro methodologies require significant inputs of materials and time, 
often at great cost. 

C. Purine, Pyrimidine, Nucleoside and Nucleotide Metabolism and Uses 

Purine and pyrimidine metabolism genes and their corresponding proteins are 
important targets for the therapy of tumor diseases and viral infections. The language 
"purine" or "pyrimidine" includes the nitrogenous bases which are constituents of 
nucleic acids, co-enzymes, and nucleotides. The term "nucleotide" includes the basic 
structural units of nucleic acid molecules, which are comprised of a nitrogenous base, a 
pentose sugar (in the case of RNA, the sugar is ribose; in the case of DNA, the sugar is 
D-deoxyribose), and phosphoric acid. The language "nucleoside" includes molecules 
which serve as precursors to nucleotides, but which are lacking the phosphoric acid 
moiety that nucleotides possess. By inhibiting the biosynthesis of these molecules, or 
their mobilization to form nucleic acid molecules, it is possible to inhibit RNA and DNA 
synthesis; by inhibiting this activity in a fashion targeted to cancerous cells, the ability 
of tumor cells to divide and replicate may be inhibited. Additionally, there are 
nucleotides which do not form nucleic acid molecules, but rather serve as energy stores 
(e.g., AMP) or as coenzymes (e.g., FAD and NAD). 

Several publications have described the use of these chemicals for these medical 
indications, by influencing purine and/or pyrimidine metabolism (e.g., Christopherson, 
R.I. and Lyons, S.D. (1990) "Potent inhibitors of de novo pyrimidine and purine 
biosynthesis as chemotherapeutic agents." Med. Res. Reviews 10: 505-548). Studies of 
enzymes involved in purine and pyrimidine metabolism have been focused on the 
development of new drugs which can be used, for example, as immunosuppressants or 
anti-proliferants (Smith, J.L. (1995) "Enzymes in nucleotide synthesis." Curr. Opin. 
Struct. Biol. 5: 752-757; (1995) Biochem. Soc. Transact. 23: 877-902). However, purine 
and pyrimidine bases, nucleosides and nucleotides have other utilities: as intermediates 
in the biosynthesis of several fine chemicals (e.g., thiamine, S-adenosyl-methionine, 
folates, or riboflavin), as energy carriers for the cell (e.g., ATP or GTP), and as 
chemicals themselves, commonly used as flavor enhancers (e.g., IMP or GMP) or for 
several medicinal applications (see, for example, Kuninaka, A. (1996) Nucleotides and 
Related Compounds in Biotechnology vol. 6, Rehm et al., eds. VCH: Weinheim, p. 
561-612). Also, enzymes involved in purine, pyrimidine, nucleoside, or nucleotide 
metabolism are increasingly serving as targets against which chemicals for crop 
protection, including fungicides, herbicides and insecticides, are developed. 

The metabolism of these compounds in bacteria has been characterized (for 
reviews see, for example, Zalkin, H. and Dixon, J.E. (1992) "de novo purine nucleotide 
biosynthesis", in: Progress in Nucleic Acid Research and Molecular Biology, vol. 42, 
Academic Press, p. 259-287; and Michal, G. (1999) "Nucleotides and Nucleosides", 
Chapter 8 in: Biochemical Pathways: An Atlas of Biochemistry and Molecular Biology, 
Wiley: New York). Purine metabolism has been the subject of intensive research, and is 
essential to the normal functioning of the cell. Impaired purine metabolism in higher 
animals can cause severe disease, such as gout. Purine nucleotides are synthesized from 
ribose-5-phosphate, in a series of steps through the intermediate compound 
inosine-5'-phosphate (IMP), resulting in the production of guanosine-5'-monophosphate (GMP) or 
adenosine-5'-monophosphate (AMP), from which the triphosphate forms utilized as 
nucleotides are readily formed. These compounds are also utilized as energy stores, so 
their degradation provides energy for many different biochemical processes in the cell. 
Pyrimidine biosynthesis proceeds by the formation of uridine-5'-monophosphate (UMP) 
from ribose-5-phosphate. UMP, in turn, is converted to cytidine-5'-triphosphate (CTP). 
The deoxy- forms of all of these nucleotides are produced in a one-step reduction 
reaction from the diphosphate ribose form of the nucleotide to the diphosphate 
deoxyribose form of the nucleotide. Upon phosphorylation, these molecules are able to 
participate in DNA synthesis. 

D. Trehalose Metabolism and Uses 

Trehalose consists of two glucose molecules, bound in α,α-1,1 linkage. It is 
commonly used in the food industry as a sweetener, an additive for dried or frozen 
foods, and in beverages. However, it also has applications in the pharmaceutical, 
cosmetics and biotechnology industries (see, for example, Nishimoto et al. (1998) U.S. 
Patent No. 5,759,610; Singer, M.A. and Lindquist, S. (1998) Trends Biotech. 16: 
460-467; Paiva, C.L.A. and Panek, A.D. (1996) Biotech. Ann. Rev. 2: 293-314; and 
Shiosaka, M. (1997) J. Japan 172: 97-102). Trehalose is produced by enzymes from 
many microorganisms and is naturally released into the surrounding medium, from 
which it can be collected using methods known in the art. 

II. Mechanisms of Metabolic Regulation 

All living cells have complex catabolic and anabolic metabolic capabilities with 
many interconnected pathways. In order to maintain a balance between the various parts 
of this extremely complex metabolic network, the cell employs a finely-tuned regulatory 
network. By regulating enzyme synthesis and enzyme activity, either independently or 
simultaneously, the cell is able to control the activity of disparate metabolic pathways to 
reflect the changing needs of the cell. 

The induction or repression of enzyme synthesis may occur at either the level of 
transcription or translation, or both. Gene expression in prokaryotes is regulated by 
several mechanisms at the level of transcription (for review see, e.g., Lewin, B. (1990) 
Genes IV, Part 3: "Controlling prokaryotic genes by transcription", Oxford University 
Press: Oxford, p. 213-301, and references therein, and Michal, G. (1999) Biochemical 
Pathways: An Atlas of Biochemistry and Molecular Biology, John Wiley & Sons). All 
such known regulatory processes are mediated by additional genes, which themselves 
respond to external influences of various kinds (e.g., temperature, nutrient availability, 
or light). Exemplary protein factors which have been implicated in this type of 
regulation include the transcription factors. These are proteins which bind to DNA, 
thereby either increasing the expression of a gene (positive regulation, as in the case of 
e.g. the ara operon from E. coli) or decreasing gene expression (negative regulation, as 
in the case of the lac operon from E. coli). These expression-modulating transcription 
factors can themselves be the subject of regulation. Their activity can, for example, be 
regulated by the binding of low molecular weight compounds to the DNA-binding 
protein, thereby stimulating (as in the case of arabinose for the ara operon) or inhibiting 
(as in the case of lactose for the lac operon) the binding of these proteins to the 
appropriate binding site on the DNA (see, for example, Helmann, J.D. and Chamberlin, 
M.J. (1988) "Structure and function of bacterial sigma factors." Annu. Rev. Biochem. 57: 
839-872; Adhya, S. (1995) "The lac and gal operons today" and Boos, W. et al., "The 
maltose system.", both in: Regulation of Gene Expression in Escherichia coli (Lin, 
E.C.C. and Lynch, A.S., eds.) Chapman & Hall: New York, p. 181-200 and 201-229; 
and Moran, C.P. (1993) "RNA polymerase and transcription factors." in: Bacillus 
subtilis and other gram-positive bacteria, Sonenshein, A.L. et al., eds. ASM: 
Washington, D.C., p. 653-667.) 

Aside from the transcriptional level, protein synthesis is also often regulated at 
the level of translation. There are multiple mechanisms by which such regulation may 
occur, including alteration of the ability of the ribosome to bind to one or more mRNAs, 
binding of the ribosome to the mRNA, the maintenance or removal of mRNA secondary 
structure, the utilization of common or less common codons for a particular gene, the 
degree of abundance of one or more tRNAs, and special regulation mechanisms, such as 
attenuation (see Vellanoweth, R.L. (1993) "Translation and its regulation" in: Bacillus 
subtilis and other gram-positive bacteria, Sonenshein, A.L. et al., eds. ASM: 
Washington, D.C., p. 699-711 and references cited therein). 

Transcriptional and translational regulation may be targeted to a single protein 
(sequential regulation) or simultaneously to several proteins in different metabolic 
pathways (coordinate regulation). Often, genes whose expression is coordinately 
regulated are physically located near one another in the genome, in an operon or 
regulon. Such up- or down-regulation of gene transcription and translation is governed 
by the cellular and extracellular levels of various factors, such as substrates (precursor 
and intermediate molecules used in one or more metabolic pathways), catabolites 
(molecules produced by biochemical pathways concerned with the production of energy 
from the breakdown of complex organic molecules such as sugars), and end products 
(the molecules resulting at the end of a metabolic pathway). Typically, the expression 
of genes encoding enzymes necessary for the activity of a particular pathway is induced 
by high levels of substrate molecules for that pathway. Similarly, such gene expression 
tends to be repressed when there exist high intracellular levels of the end product of the 
pathway (Snyder, L. and Champness, W. (1997) The Molecular Biology of Bacteria 
ASM: Washington). Gene expression may also be regulated by other external and 
internal factors, such as environmental conditions (e.g., heat, oxidative stress, or 
starvation). These global environmental changes cause alterations in the expression of 
specialized modulating genes, which directly or indirectly (via additional genes or 
proteins) trigger the expression of genes by means of binding to DNA and thereby 
inducing or repressing transcription (see, for example, Lin, E.C.C. and Lynch, A.S., eds. 
(1995) Regulation of Gene Expression in Escherichia coli. Chapman & Hall: New 
York). 

Yet another mechanism by which cellular metabolism may be regulated is at the 
level of the protein. Such regulation is accomplished either by the activities of other 
proteins, or by binding of low-molecular-weight components which either impede or 
enable the normal functioning of the protein. Examples of protein regulation by the 
binding of low-molecular-weight compounds include the binding of GTP or NAD. The 
binding of a low-molecular-weight chemical is typically reversible, as is the case with 
the GTP-binding proteins. These proteins exist in two states (with bound GTP or GDP), 
one state being the activated form of the protein, and one state being inactive. 

Regulation of protein activity by the action of other enzymes typically takes the 
form of covalent modification of the protein (e.g., phosphorylation of amino acid 
residues such as histidine or aspartate, or methylation). Such covalent modification is 
typically reversible, as mediated by an enzyme of the opposite activity. An example of 
this is the opposite activities of kinases and phosphatases in protein phosphorylation: 
protein kinases phosphorylate specific residues on a target protein (e.g., serine or 
threonine), while protein phosphatases remove phosphate groups from such proteins. 
Typically, enzymes which modulate the activity of other proteins are themselves 
modulated by external stimuli. These stimuli are mediated through proteins which 
function as sensors. A well known mechanism by which such sensor proteins may 
mediate these external signals is by dimerization, but others are also known (see, for 
example, Msadek, T. et al. (1993) "Two-Component Regulatory Systems", in: Bacillus 
subtilis and Other Gram-Positive Bacteria, Sonenshein, A.L. et al., eds., ASM: 
Washington p. 729-745 and references cited therein). 

A thorough understanding of the regulatory networks governing cellular 
metabolism in microorganisms is critical for the high-yield production of chemicals by 
fermentation. Control systems for the down-regulation of metabolic pathways could be 
removed or lessened to improve the synthesis of desired chemicals, and similarly, those 
for the up-regulation of metabolic pathways for a desired product could be constitutively 
activated or optimized in activity (as shown in Hirose, Y. and Okada, H. (1979) 
"Microbial Production of Amino Acids", in: Peppler, H.J. and Perlman, D. (eds.) 
Microbial Technology 2nd ed. Vol. 1, ch. 7 Academic Press: New York). 

III. Elements and Methods of the Invention 

The present invention is based, at least in part, on the discovery of novel 
molecules, referred to herein as MR nucleic acid and protein molecules, which regulate, 
by transcriptional, translational, or post-translational means, one or more metabolic 
pathways in C. glutamicum. In one embodiment, the MR molecules transcriptionally, 
translationally, or posttranslationally regulate a metabolic pathway in C. glutamicum. In 
a preferred embodiment, the activity of the MR molecules of the present invention to 
regulate one or more C. glutamicum metabolic pathways has an impact on the 
production of a desired fine chemical by this organism. In a particularly preferred 
embodiment, the MR molecules of the invention are modulated in activity, such that the 
C. glutamicum metabolic pathways which the MR proteins of the invention regulate are 
modulated in efficiency or output, which either directly or indirectly modulates the 
yield, production, and/or efficiency of production of a desired fine chemical by C. 
glutamicum. 

The language "MR protein" or "MR polypeptide" includes proteins which 
transcriptionally, translationally, or posttranslationally regulate a metabolic pathway in 
C. glutamicum. Examples of MR proteins include those encoded by the MR genes set 
forth in Table 1 and Appendix A. The terms "MR gene" or "MR nucleic acid sequence" 
include nucleic acid sequences encoding an MR protein, which consist of a coding 
region and also corresponding untranslated 5' and 3' sequence regions. Examples of MR 
genes include those set forth in Table 1. The terms "production" or "productivity" are 
art-recognized and include the concentration of the fermentation product (for example, 
the desired fine chemical) formed within a given time and a given fermentation volume 
(e.g., kg product per hour per liter). The term "efficiency of production" includes the 
time required for a particular level of production to be achieved (for example, how long 
it takes for the cell to attain a particular rate of output of a fine chemical). The term 
"yield" or "product/carbon yield" is art-recognized and includes the efficiency of the 
conversion of the carbon source into the product (i.e., fine chemical). This is generally 
written as, for example, kg product per kg carbon source. By increasing the yield or 
production of the compound, the quantity of recovered molecules, or of useful recovered 
molecules of that compound in a given amount of culture over a given amount of time is 
increased. The terms "biosynthesis" or a "biosynthetic pathway" are art-recognized and 
include the synthesis of a compound, preferably an organic compound, by a cell from 
intermediate compounds in what may be a multistep and highly regulated process. The 
terms "degradation" or a "degradation pathway" are art-recognized and include the 
breakdown of a compound, preferably an organic compound, by a cell to degradation 
products (generally speaking, smaller or less complex molecules) in what may be a 
multistep and highly regulated process. The language "metabolism" is art-recognized 
and includes the totality of the biochemical reactions that take place in an organism. 
The metabolism of a particular compound, then, (e.g., the metabolism of an amino acid 
such as glycine) comprises the overall biosynthetic, modification, and degradation 
pathways in the cell related to this compound. The term "regulation" is art-recognized 
and includes the activity of a protein to govern the activity of another protein. The term 
"transcriptional regulation" is art-recognized and includes the activity of a protein to 
impede or activate the conversion of a DNA encoding a target protein to mRNA. The 
term "translational regulation" is art-recognized and includes the activity of a protein to 
impede or activate the conversion of an mRNA encoding a target protein to a protein 
molecule. The term "posttranslational regulation" is art-recognized and includes the 
activity of a protein to impede or improve the activity of a target protein by covalently 
modifying the target protein (e.g., by methylation, glucosylation, or phosphorylation). 
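
The "productivity" and "yield" quantities defined above can be illustrated with a short calculation. The following Python sketch is offered for illustration only; the function names and the example figures are hypothetical and are not taken from the specification.

```python
def productivity(product_kg, hours, volume_l):
    """Productivity as defined above: kg product per hour per liter."""
    if hours <= 0 or volume_l <= 0:
        raise ValueError("hours and volume_l must be positive")
    return product_kg / (hours * volume_l)


def carbon_yield(product_kg, carbon_source_kg):
    """Product/carbon yield as defined above: kg product per kg carbon source."""
    if carbon_source_kg <= 0:
        raise ValueError("carbon_source_kg must be positive")
    return product_kg / carbon_source_kg


# Hypothetical fermentation run: 50 kg of product recovered after 40 hours
# in a 1000 liter culture that consumed 125 kg of carbon source.
run_productivity = productivity(50.0, 40.0, 1000.0)  # kg per hour per liter
run_yield = carbon_yield(50.0, 125.0)                 # kg product per kg carbon source
```

Under these assumed figures, an increase in either quantity corresponds to more recovered product for the same culture volume and time, or for the same amount of carbon source, respectively.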

In another embodiment, the MR molecules of the invention are capable of 
modulating the production of a desired molecule, such as a fine chemical, in a 
microorganism such as C. glutamicum. Using recombinant genetic techniques, one or 
more of the regulatory proteins of the invention for metabolic pathways may be 
manipulated such that its function is modulated. For example, a biosynthetic enzyme 
may be improved in efficiency, or its allosteric control region destroyed such that 
feedback inhibition of production of the compound is prevented. Similarly, a 
degradative enzyme may be deleted or modified by substitution, deletion, or addition 
such that its degradative activity is lessened for the desired compound without impairing 
the viability of the cell. In each case, the overall yield or rate of production of one of 
these desired fine chemicals may be increased. 

It is also possible that such alterations in the protein and nucleotide molecules of 
the invention may improve the production of fine chemicals in an indirect fashion. The 
regulatory mechanisms of metabolic pathways in the cell are necessarily intertwined, 
and the activation of one pathway may lead to the repression or activation of another in 
a concomitant fashion. Therefore, by modulating the activity of one or more of the 
proteins of the invention, the production or efficiency of activity of another fine 
chemical biosynthetic or degradative pathway may be impacted. For example, by 
decreasing the ability of an MR protein to repress the transcription of a gene encoding a 
particular amino acid biosynthetic protein, one may concomitantly derepress other 
amino acid biosynthetic pathways, since these pathways are interrelated. Further, by 
modifying the MR proteins of the invention, one may uncouple the growth and division 
of cells from their extracellular surroundings to a certain degree; by impairing an MR 
protein which normally represses biosynthesis of a nucleotide when the extracellular 
conditions are suboptimal for growth and cell division such that it now lacks this 
function, one may permit growth to occur even when the extracellular conditions are 
poor. This is of particular relevance in large-scale fermentative growth, where 
conditions within the culture are often suboptimal in terms of temperature, nutrient 
supply or aeration, but would still support growth and cell division if the cellular 
regulatory systems for these factors were eliminated. 

The isolated nucleic acid sequences of the invention are contained within the 
genome of a Corynebacterium glutamicum strain available through the American Type 
Culture Collection, given designation ATCC 13032. The nucleotide sequence of the 
isolated C. glutamicum MR DNAs and the predicted amino acid sequences of the C. 
glutamicum MR proteins are shown in Appendices A and B, respectively. 

Computational analyses were performed which classified and/or identified these 
nucleotide sequences as sequences which encode metabolic pathway regulatory proteins. 

The present invention also pertains to proteins which have an amino acid 
sequence which is substantially homologous to an amino acid sequence of Appendix B. 
As used herein, a protein which has an amino acid sequence which is substantially 
homologous to a selected amino acid sequence is at least about 50% homologous to the 
selected amino acid sequence, e.g., the entire selected amino acid sequence. A protein 
which has an amino acid sequence which is substantially homologous to a selected 
amino acid sequence can also be at least about 50-60%, preferably at least about 60-70%, 
and more preferably at least about 70-80%, 80-90%, or 90-95%, and most preferably at 
least about 96%, 97%, 98%, 99% or more homologous to the selected amino acid 
sequence. 
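
The homology thresholds recited above can be illustrated with a simple comparison. The following Python sketch assumes, for illustration only, that percent homology is computed as the percentage of identical positions between two sequences that have already been aligned to equal length (with "-" marking gaps); the function names and the choice of the alignment length as denominator are assumptions and are not taken from this passage.

```python
def percent_identity(aligned_a, aligned_b):
    """Percentage of alignment columns holding the same residue in both sequences.

    Both inputs must already be aligned to the same length; '-' marks a gap.
    Gap columns count toward the length but never count as matches.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")
    if not aligned_a:
        raise ValueError("sequences must not be empty")
    matches = sum(
        1 for x, y in zip(aligned_a, aligned_b) if x == y and x != "-"
    )
    return 100.0 * matches / len(aligned_a)


def is_substantially_homologous(aligned_a, aligned_b, threshold=50.0):
    """True when the two aligned sequences meet the stated percentage threshold."""
    return percent_identity(aligned_a, aligned_b) >= threshold


# Hypothetical eight-residue peptides differing at one position.
example = percent_identity("MKTAYIAK", "MKTAYLAK")
```

The `threshold` argument can be raised to model the narrower preferred ranges recited above.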

The MR protein or a biologically active portion or fragment thereof of the 
invention can transcriptionally, translationally, or posttranslationally regulate a 
metabolic pathway in C. glutamicum, or have one or more of the activities set forth in 
Table 1. 

Various aspects of the invention are described in further detail in the following 
subsections: 

A. Isolated Nucleic Acid Molecules 

One aspect of the invention pertains to isolated nucleic acid molecules that 
encode MR polypeptides or biologically active portions thereof, as well as nucleic acid 
fragments sufficient for use as hybridization probes or primers for the identification or 
amplification of MR-encoding nucleic acid (e.g., MR DNA). As used herein, the term 
"nucleic acid molecule" is intended to include DNA molecules (e.g., cDNA or genomic 
DNA) and RNA molecules (e.g., mRNA) and analogs of the DNA or RNA generated 
using nucleotide analogs. This term also encompasses untranslated sequence located at 
both the 3' and 5' ends of the coding region of the gene: at least about 100 nucleotides 
of sequence upstream from the 5' end of the coding region and at least about 20 
nucleotides of sequence downstream from the 3' end of the coding region of the gene. 
The nucleic acid molecule can be single-stranded or double-stranded, but preferably is 
double-stranded DNA. An "isolated" nucleic acid molecule is one which is separated 
from other nucleic acid molecules which are present in the natural source of the nucleic 
acid. Preferably, an "isolated" nucleic acid is free of sequences which naturally flank 
the nucleic acid (i.e., sequences located at the 5' and 3' ends of the nucleic acid) in the 
genomic DNA of the organism from which the nucleic acid is derived. For example, in 
various embodiments, the isolated MR nucleic acid molecule can contain less than about 
5 kb, 4 kb, 3 kb, 2 kb, 1 kb, 0.5 kb or 0.1 kb of nucleotide sequences which naturally flank 
the nucleic acid molecule in genomic DNA of the cell from which the nucleic acid is 
derived (e.g., a C. glutamicum cell). Moreover, an "isolated" nucleic acid molecule, 
such as a DNA molecule, can be substantially free of other cellular material, or culture 
medium when produced by recombinant techniques, or chemical precursors or other 
chemicals when chemically synthesized. 

A nucleic acid molecule of the present invention, e.g., a nucleic acid molecule 
having a nucleotide sequence of Appendix A, or a portion thereof, can be isolated using 
standard molecular biology techniques and the sequence information provided herein. 
For example, a C. glutamicum MR DNA can be isolated from a C. glutamicum library 
using all or a portion of one of the sequences of Appendix A as a hybridization probe and 
standard hybridization techniques (e.g., as described in Sambrook, J., Fritsch, E. F., and 
Maniatis, T. Molecular Cloning: A Laboratory Manual. 2nd ed., Cold Spring Harbor 
Laboratory, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, NY, 1989). 
Moreover, a nucleic acid molecule encompassing all or a portion of one of the sequences 
of Appendix A can be isolated by the polymerase chain reaction using oligonucleotide 
primers designed based upon this sequence (e.g., a nucleic acid molecule encompassing 
all or a portion of one of the sequences of Appendix A can be isolated by the polymerase 
chain reaction using oligonucleotide primers designed based upon this same sequence of 
Appendix A). For example, mRNA can be isolated from normal endothelial cells (e.g., 
by the guanidinium-thiocyanate extraction procedure of Chirgwin et al. (1979) 
Biochemistry 18: 5294-5299) and DNA can be prepared using reverse transcriptase (e.g., 
Moloney MLV reverse transcriptase, available from Gibco/BRL, Bethesda, MD; or 
AMV reverse transcriptase, available from Seikagaku America, Inc., St. Petersburg, FL). 
Synthetic oligonucleotide primers for polymerase chain reaction amplification can be 
designed based upon one of the nucleotide sequences shown in Appendix A. A nucleic 
acid of the invention can be amplified using cDNA or, alternatively, genomic DNA, as a 
template and appropriate oligonucleotide primers according to standard PCR 
amplification techniques. The nucleic acid so amplified can be cloned into an 
appropriate vector and characterized by DNA sequence analysis. Furthermore, 
oligonucleotides corresponding to an MR nucleotide sequence can be prepared by 
standard synthetic techniques, e.g., using an automated DNA synthesizer. 
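
The design of oligonucleotide primers from a known sequence, as described above, can be sketched as follows. This Python example is a minimal illustration under stated assumptions: the forward primer is taken as the first bases of the template strand and the reverse primer as the reverse complement of its last bases. The function names, the default primer length, and the example template are hypothetical; practical primer design would additionally weigh factors such as melting temperature and GC content.

```python
# Watson-Crick base pairing for DNA, upper and lower case.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.translate(_COMPLEMENT)[::-1]


def design_primers(template, length=20):
    """Return (forward, reverse) primers flanking the whole template.

    forward: the first `length` bases of the template strand.
    reverse: the reverse complement of the last `length` bases.
    """
    if length <= 0:
        raise ValueError("length must be positive")
    if len(template) < 2 * length:
        raise ValueError("template too short for two non-overlapping primers")
    forward = template[:length]
    reverse = reverse_complement(template[-length:])
    return forward, reverse


# Hypothetical 18-base template, using short 6-base primers for readability.
forward_primer, reverse_primer = design_primers("ATGAAACCCGGGTTTTAA", length=6)
```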

In a preferred embodiment, an isolated nucleic acid molecule of the invention 
comprises one of the nucleotide sequences shown in Appendix A. The sequences of 
Appendix A correspond to the Corynebacterium glutamicum MR DNAs of the 
invention. This DNA comprises sequences encoding MR proteins (i.e., the "coding 
region", indicated in each sequence in Appendix A), as well as 5' untranslated sequences 
and 3' untranslated sequences, also indicated in Appendix A. Alternatively, the nucleic 
acid molecule can comprise only the coding region of any of the sequences in Appendix 
A. 

For the purposes of this application, it will be understood that each of the 
sequences set forth in Appendix A has an identifying RXA, RXN, or RXS number 
having the designation "RXA", "RXN", or "RXS" followed by 5 digits (e.g., 
RXA00603, RXN03181, or RXS00686). Each of these sequences comprises up to three 
parts: a 5' upstream region, a coding region, and a downstream region. Each of these 
three regions is identified by the same RXA, RXN, or RXS designation to eliminate 
confusion. The recitation "one of the sequences in Appendix A", then, refers to any of 
the sequences in Appendix A, which may be distinguished by their differing RXA, 
RXN, or RXS designations. The coding region of each of these sequences is translated 
into a corresponding amino acid sequence, which is set forth in Appendix B. The 
sequences of Appendix B are identified by the same RXA, RXN, or RXS designations 
as Appendix A, such that they can be readily correlated. For example, the amino acid 
sequences in Appendix B designated RXA00603, RXN03181, and RXS00686 are 
translations of the coding regions of the nucleotide sequence of nucleic acid molecules 
RXA00603, RXN03181, and RXS00686, respectively, in Appendix A. Each of the 
RXA, RXN, and RXS nucleotide and amino acid sequences of the invention has also 
been assigned a SEQ ID NO, as indicated in Table 1. For example, as shown in Table 1, 
the nucleotide sequence of RXA00603 is SEQ ID NO:5 and the amino acid sequence of 
RXA00603 is SEQ ID NO:6. 

Several of the genes of the invention are "F-designated genes". An F-designated
gene includes those genes set forth in Table 1 which have an 'F' in front of the RXA,
RXN, or RXS designation. For example, SEQ ID NO:3, designated, as indicated on
Table 1, as "F RXA02880", is an F-designated gene, as are SEQ ID NOs: 21, 27, and 33
(designated on Table 1 as "F RXA02493", "F RXA00291", and "F RXA00651",
respectively).

In one embodiment, the nucleic acid molecules of the present invention are not
intended to include those compiled in Table 2. In the case of the dapD gene, a sequence
for this gene was published in Wehrmann, A., et al. (1998) J. Bacteriol. 180(12): 3159-3165.
However, the sequence obtained by the inventors of the present application is
significantly longer than the published version. It is believed that the published version
relied on an incorrect start codon, and thus represents only a fragment of the actual
coding region.

In another preferred embodiment, an isolated nucleic acid molecule of the
invention comprises a nucleic acid molecule which is a complement of one of the
nucleotide sequences shown in Appendix A, or a portion thereof. A nucleic acid
molecule which is complementary to one of the nucleotide sequences shown in
Appendix A is one which is sufficiently complementary to one of the nucleotide
sequences shown in Appendix A such that it can hybridize to one of the nucleotide
sequences shown in Appendix A, thereby forming a stable duplex.

In still another preferred embodiment, an isolated nucleic acid molecule of the
invention comprises a nucleotide sequence which is at least about 50%, 51%, 52%, 53%,
54%, 55%, 56%, 57%, 58%, 59%, or 60%, preferably at least about 61%, 62%, 63%,
64%, 65%, 66%, 67%, 68%, 69%, or 70%, more preferably at least about 71%, 72%,
73%, 74%, 75%, 76%, 77%, 78%, 79%, or 80%, 81%, 82%, 83%, 84%, 85%, 86%,
87%, 88%, 89%, or 90%, or 91%, 92%, 93%, 94%, and even more preferably at least
about 95%, 96%, 97%, 98%, 99% or more homologous to a nucleotide sequence shown
in Appendix A, or a portion thereof. Ranges and identity values intermediate to the
above-recited ranges (e.g., 70-90% identical or 80-95% identical) are also intended to
be encompassed by the present invention. For example, ranges of identity values using a
combination of any of the above values recited as upper and/or lower limits are intended
to be included. In an additional preferred embodiment, an isolated nucleic acid
molecule of the invention comprises a nucleotide sequence which hybridizes, e.g.,
hybridizes under stringent conditions, to one of the nucleotide sequences shown in
Appendix A, or a portion thereof.

Moreover, the nucleic acid molecule of the invention can comprise only a
portion of the coding region of one of the sequences in Appendix A, for example a
fragment which can be used as a probe or primer or a fragment encoding a biologically
active portion of an MR protein. The nucleotide sequences determined from the cloning
of the MR genes from C. glutamicum allow for the generation of probes and primers
designed for use in identifying and/or cloning MR homologues in other cell types and
organisms, as well as MR homologues from other Corynebacteria or related species.
The probe/primer typically comprises substantially purified oligonucleotide. The
oligonucleotide typically comprises a region of nucleotide sequence that hybridizes
under stringent conditions to at least about 12, preferably about 25, more preferably
about 40, 50 or 75 consecutive nucleotides of a sense strand of one of the sequences set
forth in Appendix A, an anti-sense sequence of one of the sequences set forth in
Appendix A, or naturally occurring mutants thereof. Primers based on a nucleotide
sequence of Appendix A can be used in PCR reactions to clone MR homologues.
Probes based on the MR nucleotide sequences can be used to detect transcripts or
genomic sequences encoding the same or homologous proteins. In preferred
embodiments, the probe further comprises a label group attached thereto, e.g., the label
group can be a radioisotope, a fluorescent compound, an enzyme, or an enzyme
co-factor. Such probes can be used as a part of a diagnostic test kit for identifying cells
which misexpress an MR protein, such as by measuring a level of an MR-encoding
nucleic acid in a sample of cells, e.g., detecting MR mRNA levels or determining
whether a genomic MR gene has been mutated or deleted.

In one embodiment, the nucleic acid molecule of the invention encodes a protein
or portion thereof which includes an amino acid sequence which is sufficiently
homologous to an amino acid sequence of Appendix B such that the protein or portion
thereof maintains the ability to transcriptionally, translationally, or posttranslationally
regulate a metabolic pathway in C. glutamicum. As used herein, the language
"sufficiently homologous" refers to proteins or portions thereof which have amino acid
sequences which include a minimum number of identical or equivalent (e.g., an amino
acid residue which has a similar side chain as an amino acid residue in one of the
sequences of Appendix B) amino acid residues to an amino acid sequence of Appendix
B such that the protein or portion thereof is able to transcriptionally, translationally, or
posttranslationally regulate a metabolic pathway in C. glutamicum. Protein members of
such metabolic pathways, as described herein, may function to regulate the biosynthesis
or degradation of one or more fine chemicals. Examples of such activities are also
described herein. Thus, "the function of an MR protein" contributes to the overall
regulation of one or more fine chemical metabolic pathways, or contributes, either
directly or indirectly, to the yield, production, and/or efficiency of production of one or
more fine chemicals. Examples of MR protein activities are set forth in Table 1.

In another embodiment, the protein is at least about 50-60%, preferably at least
about 60-70%, and more preferably at least about 70-80%, 80-90%, 90-95%, and most
preferably at least about 96%, 97%, 98%, 99% or more homologous to an entire amino
acid sequence of Appendix B.

Portions of proteins encoded by the MR nucleic acid molecules of the invention
are preferably biologically active portions of one of the MR proteins. As used herein,
the term "biologically active portion of an MR protein" is intended to include a portion,
e.g., a domain/motif, of an MR protein that transcriptionally, translationally, or
posttranslationally regulates a metabolic pathway in C. glutamicum, or has an activity as
set forth in Table 1. To determine whether an MR protein or a biologically active
portion thereof can transcriptionally, translationally, or posttranslationally regulate a
metabolic pathway in C. glutamicum, an assay of enzymatic activity may be performed.
Such assay methods are well known to those of ordinary skill in the art, as detailed in
Example 8 of the Exemplification.

Additional nucleic acid fragments encoding biologically active portions of an
MR protein can be prepared by isolating a portion of one of the sequences in Appendix
B, expressing the encoded portion of the MR protein or peptide (e.g., by recombinant
expression in vitro) and assessing the activity of the encoded portion of the MR protein
or peptide.

The invention further encompasses nucleic acid molecules that differ from one of
the nucleotide sequences shown in Appendix A (and portions thereof) due to degeneracy
of the genetic code and thus encode the same MR protein as that encoded by the
nucleotide sequences shown in Appendix A. In another embodiment, an isolated nucleic
acid molecule of the invention has a nucleotide sequence encoding a protein having an
amino acid sequence shown in Appendix B. In a still further embodiment, the nucleic
acid molecule of the invention encodes a full length C. glutamicum protein which is
substantially homologous to an amino acid sequence of Appendix B (encoded by an
open reading frame shown in Appendix A).
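
The effect of codon degeneracy can be illustrated with a minimal Python sketch. It
assumes the standard genetic code; the names CODON_TABLE and translate, and the two
short example sequences, are hypothetical and are not taken from Appendix A.

```python
from itertools import product

# Standard genetic code, codons enumerated in the order TTT, TTC, TTA, TTG, TCT, ...
# (an assumption of this sketch; "*" marks a stop codon).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna: str) -> str:
    """Translate a coding sequence codon by codon, ignoring a trailing partial codon."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


# Two hypothetical sequences that differ only at third codon positions.
seq_a = "ATGGCTAAA"
seq_b = "ATGGCCAAG"
print(translate(seq_a), translate(seq_b))
```

The two nucleotide sequences differ at two positions yet translate to the same
amino acid sequence, which is the situation contemplated in the paragraph above.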

It will be understood by one of ordinary skill in the art that in one embodiment
the sequences of the invention are not meant to include the sequences of the prior art,
such as those Genbank sequences set forth in Tables 2 or 4 which were available prior to
the present invention. In one embodiment, the invention includes nucleotide and amino
acid sequences having a percent identity to a nucleotide or amino acid sequence of the
invention which is greater than that of a sequence of the prior art (e.g., a Genbank
sequence (or the protein encoded by such a sequence) set forth in Tables 2 or 4). For
example, the invention includes a nucleotide sequence which is greater than and/or at
least 40% identical to the nucleotide sequence designated RXA00603 (SEQ ID NO:5), a
nucleotide sequence which is greater than and/or at least 55% identical to the nucleotide
sequence designated RXA00129 (SEQ ID NO:29), and a nucleotide sequence which is
greater than and/or at least 40% identical to the nucleotide sequence designated
RXA00006 (SEQ ID NO:35). One of ordinary skill in the art would be able to calculate
the lower threshold of percent identity for any given sequence of the invention by
examining the GAP-calculated percent identity scores set forth in Table 4 for each of the
three top hits for the given sequence, and by subtracting the highest GAP-calculated
percent identity from 100 percent. One of ordinary skill in the art will also appreciate
that nucleic acid and amino acid sequences having percent identities greater than the
lower threshold so calculated (e.g., at least 50%, 51%, 52%, 53%, 54%, 55%, 56%,
57%, 58%, 59%, or 60%, preferably at least about 61%, 62%, 63%, 64%, 65%, 66%,
67%, 68%, 69%, or 70%, more preferably at least about 71%, 72%, 73%, 74%, 75%,
76%, 77%, 78%, 79%, or 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, or
90%, or 91%, 92%, 93%, 94%, and even more preferably at least about 95%, 96%, 97%,
98%, 99% or more identical) are also encompassed by the invention.
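
The threshold calculation described above can be sketched in Python as follows. The
function name lower_threshold is hypothetical, and the three scores in the example are
placeholders chosen for illustration; they are not values from Table 4.

```python
def lower_threshold(top_hit_identities):
    """Return 100 percent minus the highest GAP-calculated percent identity.

    `top_hit_identities` holds the percent identity scores of the top hits
    for one sequence (three per sequence, according to the text above).
    """
    if not top_hit_identities:
        raise ValueError("at least one percent identity score is required")
    return 100.0 - max(top_hit_identities)


# Placeholder scores for the three top hits of a hypothetical sequence.
example_scores = [60.0, 52.5, 48.0]
print(lower_threshold(example_scores))
```

With these placeholder scores the highest identity is 60.0, so the rule as recited
yields a lower threshold of 40.0 percent.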

In addition to the C. glutamicum MR nucleotide sequences shown in Appendix
A, it will be appreciated by those of ordinary skill in the art that DNA sequence
polymorphisms that lead to changes in the amino acid sequences of MR proteins may
exist within a population (e.g., the C. glutamicum population). Such genetic
polymorphism in the MR gene may exist among individuals within a population due to
natural variation. As used herein, the terms "gene" and "recombinant gene" refer to
nucleic acid molecules comprising an open reading frame encoding an MR protein,
preferably a C. glutamicum MR protein. Such natural variations can typically result in
1-5% variance in the nucleotide sequence of the MR gene. Any and all such nucleotide
variations and resulting amino acid polymorphisms in MR that are the result of natural
variation and that do not alter the functional activity of MR proteins are intended to be
within the scope of the invention.

Nucleic acid molecules corresponding to natural variants and non-C. glutamicum
homologues of the C. glutamicum MR DNA of the invention can be isolated based on
their homology to the C. glutamicum MR nucleic acid disclosed herein using the C.
glutamicum DNA, or a portion thereof, as a hybridization probe according to standard
hybridization techniques under stringent hybridization conditions. Accordingly, in
another embodiment, an isolated nucleic acid molecule of the invention is at least 15
nucleotides in length and hybridizes under stringent conditions to the nucleic acid
molecule comprising a nucleotide sequence of Appendix A. In other embodiments, the
nucleic acid is at least 30, 50, 100, 250 or more nucleotides in length. As used herein,
the term "hybridizes under stringent conditions" is intended to describe conditions for
hybridization and washing under which nucleotide sequences at least 60% homologous
to each other typically remain hybridized to each other. Preferably, the conditions are
such that sequences at least about 65%, more preferably at least about 70%, and even
more preferably at least about 75% or more homologous to each other typically remain
hybridized to each other. Such stringent conditions are known to those of ordinary skill
in the art and can be found in Current Protocols in Molecular Biology, John Wiley &
Sons, N.Y. (1989), 6.3.1-6.3.6. A preferred, non-limiting example of stringent
hybridization conditions is hybridization in 6X sodium chloride/sodium citrate (SSC)
at about 45°C, followed by one or more washes in 0.2X SSC, 0.1% SDS at 50-65°C.
Preferably, an isolated nucleic acid molecule of the invention that hybridizes under
stringent conditions to a sequence of Appendix A corresponds to a naturally-occurring
nucleic acid molecule. As used herein, a "naturally-occurring" nucleic acid molecule
refers to an RNA or DNA molecule having a nucleotide sequence that occurs in nature
(e.g., encodes a natural protein). In one embodiment, the nucleic acid encodes a natural
C. glutamicum MR protein.
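
The salt concentrations implied by the SSC multiples recited above can be worked out
with a short Python sketch. It assumes the conventional definition of 1X SSC as 0.15 M
sodium chloride and 0.015 M sodium citrate (a 20X stock being 3 M and 0.3 M,
respectively), which is not stated in this specification; the function name
ssc_molarity is hypothetical.

```python
# Assumed composition of 1X SSC in mol/L (conventional definition, not from the text).
SSC_1X_MOLAR = {"sodium chloride": 0.15, "sodium citrate": 0.015}


def ssc_molarity(fold: float) -> dict:
    """Return the molar concentration of each SSC component at a given multiple."""
    return {name: round(conc * fold, 6) for name, conc in SSC_1X_MOLAR.items()}


# Hybridization buffer (6X) and wash buffer (0.2X) from the example conditions.
for fold in (6, 0.2):
    print(f"{fold}X SSC:", ssc_molarity(fold))
```

Under that assumption, the 6X hybridization buffer is about 0.9 M in sodium chloride,
while the 0.2X wash is about 0.03 M, which is why the wash step is the more stringent
of the two.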

In addition to naturally-occurring variants of the MR sequence that may exist in
the population, one of ordinary skill in the art will further appreciate that changes can be
introduced by mutation into a nucleotide sequence of Appendix A, thereby leading to
changes in the amino acid sequence of the encoded MR protein, without altering the
functional ability of the MR protein. For example, nucleotide substitutions leading to
amino acid substitutions at "non-essential" amino acid residues can be made in a
sequence of Appendix A. A "non-essential" amino acid residue is a residue that can be
altered from the wild-type sequence of one of the MR proteins (Appendix B) without
altering the activity of said MR protein, whereas an "essential" amino acid residue is
required for MR protein activity. Other amino acid residues, however (e.g., those that
are not conserved or only semi-conserved in the domain having MR activity), may not be
essential for activity and thus are likely to be amenable to alteration without altering MR
activity.

Accordingly, another aspect of the invention pertains to nucleic acid molecules
encoding MR proteins that contain changes in amino acid residues that are not essential
for MR activity. Such MR proteins differ in amino acid sequence from a sequence
contained in Appendix B yet retain at least one of the MR activities described herein. In
one embodiment, the isolated nucleic acid molecule comprises a nucleotide sequence
encoding a protein, wherein the protein comprises an amino acid sequence at least about
50% homologous to an amino acid sequence of Appendix B and is capable of
transcriptionally, translationally, or posttranslationally regulating a metabolic pathway
in C. glutamicum, or has one or more activities set forth in Table 1. Preferably, the
protein encoded by the nucleic acid molecule is at least about 50-60% homologous to
one of the sequences in Appendix B, more preferably at least about 60-70% homologous
to one of the sequences in Appendix B, even more preferably at least about 70-80%,
80-90%, 90-95% homologous to one of the sequences in Appendix B, and most preferably
at least about 96%, 97%, 98%, or 99% homologous to one of the sequences in Appendix
B.

To determine the percent homology of two amino acid sequences (e.g., one of
the sequences of Appendix B and a mutant form thereof) or of two nucleic acids, the
sequences are aligned for optimal comparison purposes (e.g., gaps can be introduced in
the sequence of one protein or nucleic acid for optimal alignment with the other protein
or nucleic acid). The amino acid residues or nucleotides at corresponding amino acid
positions or nucleotide positions are then compared. When a position in one sequence
(e.g., one of the sequences of Appendix B) is occupied by the same amino acid residue
or nucleotide as the corresponding position in the other sequence (e.g., a mutant form of
the sequence selected from Appendix B), then the molecules are homologous at that
position (i.e., as used herein amino acid or nucleic acid "homology" is equivalent to
amino acid or nucleic acid "identity"). The percent homology between the two
sequences is a function of the number of identical positions shared by the sequences
(i.e., % homology = # of identical positions/total # of positions x 100).
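
The formula above can be sketched in Python for two sequences that have already been
aligned. The function name percent_homology and the example sequences are hypothetical.
The sketch assumes that gaps are written as "-" and that the total number of positions
is the length of the alignment, gap columns included; the text above does not specify
how gap columns are counted.

```python
def percent_homology(aligned_a: str, aligned_b: str) -> float:
    """Percent homology (identity) of two pre-aligned sequences of equal length.

    Gaps are written as "-". A column counts as identical only when both
    sequences carry the same residue or nucleotide (never gap against gap).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    total = len(aligned_a)
    if total == 0:
        raise ValueError("aligned sequences must not be empty")
    identical = sum(
        1 for x, y in zip(aligned_a.upper(), aligned_b.upper()) if x == y and x != "-"
    )
    return 100.0 * identical / total


# Hypothetical wild-type fragment and a mutant form with two substitutions.
print(percent_homology("MKTAYIAK", "MKSAYLAK"))
```

In the example, six of the eight aligned positions are identical, giving 75 percent
homology.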

An isolated nucleic acid molecule encoding an MR protein homologous to a
protein sequence of Appendix B can be created by introducing one or more nucleotide
substitutions, additions or deletions into a nucleotide sequence of Appendix A such that
one or more amino acid substitutions, additions or deletions are introduced into the
encoded protein. Mutations can be introduced into one of the sequences of Appendix A
by standard techniques, such as site-directed mutagenesis and PCR-mediated
mutagenesis. Preferably, conservative amino acid substitutions are made at one or more
predicted non-essential amino acid residues. A "conservative amino acid substitution" is
one in which the amino acid residue is replaced with an amino acid residue having a
similar side chain. Families of amino acid residues having similar side chains have been
defined in the art. These families include amino acids with basic side chains (e.g.,
lysine, arginine, histidine), acidic side chains (e.g., aspartic acid, glutamic acid),
uncharged polar side chains (e.g., glycine, asparagine, glutamine, serine, threonine,
tyrosine, cysteine), nonpolar side chains (e.g., alanine, valine, leucine, isoleucine,
proline, phenylalanine, methionine, tryptophan), beta-branched side chains (e.g.,
threonine, valine, isoleucine) and aromatic side chains (e.g., tyrosine, phenylalanine,
tryptophan, histidine). Thus, a predicted nonessential amino acid residue in an MR
protein is preferably replaced with another amino acid residue from the same side chain
family. Alternatively, in another embodiment, mutations can be introduced randomly
along all or part of an MR coding sequence, such as by saturation mutagenesis, and the
resultant mutants can be screened for an MR activity described herein to identify
mutants that retain MR activity. Following mutagenesis of one of the sequences of
Appendix A, the encoded protein can be expressed recombinantly and the activity of the
protein can be determined using, for example, assays described herein (see Example 8 of
the Exemplification).
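
The side chain families listed above lend themselves to a simple lookup. The following
Python sketch encodes those families with one-letter amino acid codes and tests whether
a proposed substitution stays within a family; the names FAMILIES, shared_families and
is_conservative are hypothetical.

```python
# Side chain families as listed in the text, in one-letter amino acid codes.
FAMILIES = {
    "basic": set("KRH"),                # lysine, arginine, histidine
    "acidic": set("DE"),                # aspartic acid, glutamic acid
    "uncharged polar": set("GNQSTYC"),  # glycine ... cysteine
    "nonpolar": set("AVLIPFMW"),        # alanine ... tryptophan
    "beta-branched": set("TVI"),        # threonine, valine, isoleucine
    "aromatic": set("YFWH"),            # tyrosine, phenylalanine, tryptophan, histidine
}


def shared_families(original: str, replacement: str) -> list:
    """Names of the families that contain both residues, sorted alphabetically."""
    return sorted(
        name for name, members in FAMILIES.items()
        if original in members and replacement in members
    )


def is_conservative(original: str, replacement: str) -> bool:
    """True when two different residues share at least one side chain family."""
    return original != replacement and bool(shared_families(original, replacement))


print(is_conservative("K", "R"), is_conservative("D", "K"))
```

Because some residues belong to more than one family (for example, threonine is both
uncharged polar and beta-branched), the sketch treats a substitution as conservative
if any single family contains both residues.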

In addition to the nucleic acid molecules encoding MR proteins described above,
another aspect of the invention pertains to isolated nucleic acid molecules which are
antisense thereto. An "antisense" nucleic acid comprises a nucleotide sequence which is
complementary to a "sense" nucleic acid encoding a protein, e.g., complementary to the
coding strand of a double-stranded DNA molecule or complementary to an mRNA
sequence. Accordingly, an antisense nucleic acid can hydrogen bond to a sense nucleic
acid. The antisense nucleic acid can be complementary to an entire MR coding strand,
or to only a portion thereof. In one embodiment, an antisense nucleic acid molecule is
antisense to a "coding region" of the coding strand of a nucleotide sequence encoding an
MR protein. The term "coding region" refers to the region of the nucleotide sequence
comprising codons which are translated into amino acid residues (e.g., the entire
coding region of SEQ ID NO:1 (RXN03181) comprises nucleotides 1 to 444). In another
embodiment, the antisense nucleic acid molecule is antisense to a "noncoding region" of
the coding strand of a nucleotide sequence encoding MR. The term "noncoding region"
refers to 5' and 3' sequences which flank the coding region that are not translated into
amino acids (i.e., also referred to as 5' and 3' untranslated regions).

Given the coding strand sequences encoding MR disclosed herein (e.g., the
sequences set forth in Appendix A), antisense nucleic acids of the invention can be
designed according to the rules of Watson and Crick base pairing. The antisense nucleic
acid molecule can be complementary to the entire coding region of MR mRNA, but
more preferably is an oligonucleotide which is antisense to only a portion of the coding
or noncoding region of MR mRNA. For example, the antisense oligonucleotide can be
complementary to the region surrounding the translation start site of MR mRNA. An
antisense oligonucleotide can be, for example, about 5, 10, 15, 20, 25, 30, 35, 40, 45 or
50 nucleotides in length. An antisense nucleic acid of the invention can be constructed
using chemical synthesis and enzymatic ligation reactions using procedures known in
the art. For example, an antisense nucleic acid (e.g., an antisense oligonucleotide) can
be chemically synthesized using naturally occurring nucleotides or variously modified
nucleotides designed to increase the biological stability of the molecules or to increase
the physical stability of the duplex formed between the antisense and sense nucleic
acids, e.g., phosphorothioate derivatives and acridine substituted nucleotides can be
used. Examples of modified nucleotides which can be used to generate the antisense
nucleic acid include 5-fluorouracil, 5-bromouracil, 5-chlorouracil, 5-iodouracil,
hypoxanthine, xanthine, 4-acetylcytosine, 5-(carboxyhydroxylmethyl) uracil,
5-carboxymethylaminomethyl-2-thiouridine, 5-carboxymethylaminomethyluracil,
dihydrouracil, beta-D-galactosylqueosine, inosine, N6-isopentenyladenine,
1-methylguanine, 1-methylinosine, 2,2-dimethylguanine, 2-methyladenine,
2-methylguanine, 3-methylcytosine, 5-methylcytosine, N6-adenine, 7-methylguanine,
5-methylaminomethyluracil, 5-methoxyaminomethyl-2-thiouracil,
beta-D-mannosylqueosine, 5'-methoxycarboxymethyluracil, 5-methoxyuracil,
2-methylthio-N6-isopentenyladenine, uracil-5-oxyacetic acid (v), wybutoxosine,
pseudouracil, queosine, 2-thiocytosine, 5-methyl-2-thiouracil, 2-thiouracil,
4-thiouracil, 5-methyluracil, uracil-5-oxyacetic acid methylester, uracil-5-oxyacetic
acid (v), 5-methyl-2-thiouracil, 3-(3-amino-3-N-2-carboxypropyl) uracil, (acp3)w, and
2,6-diaminopurine. Alternatively, the antisense nucleic acid can be produced
biologically using an expression vector into
which a nucleic acid has been subcloned in an antisense orientation (i.e., RNA
transcribed from the inserted nucleic acid will be of an antisense orientation to a target
nucleic acid of interest, described further in the following subsection).
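
Designing an antisense oligonucleotide by Watson and Crick base pairing, as described
above, can be sketched in Python. The function name antisense_oligo, its parameters,
and the example coding strand are hypothetical and are not taken from Appendix A. The
sketch returns a DNA oligonucleotide complementary to a window around the translation
start site, written 5' to 3'.

```python
# Watson-Crick pairing partners for DNA bases.
_PAIR = {"A": "T", "C": "G", "G": "C", "T": "A"}


def antisense_oligo(coding_strand: str, start_codon_index: int,
                    upstream: int = 10, downstream: int = 10) -> str:
    """Return the antisense DNA oligonucleotide for a window around the start codon.

    `start_codon_index` is the 0-based position of the first base of the start
    codon in `coding_strand`. The window spans `upstream` bases before that
    position and `downstream` bases from it onward, clipped to the sequence ends.
    """
    seq = coding_strand.upper()
    lo = max(0, start_codon_index - upstream)
    hi = min(len(seq), start_codon_index + downstream)
    target = seq[lo:hi]
    # Reverse complement: read the target backwards and pair each base.
    return "".join(_PAIR[base] for base in reversed(target))


# Hypothetical coding strand with the ATG start codon at index 10.
example = "GGAGGTTACCATGAGCAAAGGT"
print(antisense_oligo(example, 10, upstream=5, downstream=10))
```

The resulting 15-nucleotide oligonucleotide falls within the length range recited
above; the window sizes are free parameters of the sketch and carry no significance
beyond illustration.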

The antisense nucleic acid molecules of the invention are typically administered
to a cell or generated in situ such that they hybridize with or bind to cellular mRNA
and/or genomic DNA encoding an MR protein to thereby inhibit expression of the
protein, e.g., by inhibiting transcription and/or translation. The hybridization can be by
conventional nucleotide complementarity to form a stable duplex, or, for example, in the
case of an antisense nucleic acid molecule which binds to DNA duplexes, through
specific interactions in the major groove of the double helix. The antisense molecule can
be modified such that it specifically binds to a receptor or an antigen expressed on a
selected cell surface, e.g., by linking the antisense nucleic acid molecule to a peptide or
an antibody which binds to a cell surface receptor or antigen. The antisense nucleic acid
molecule can also be delivered to cells using the vectors described herein. To achieve
sufficient intracellular concentrations of the antisense molecules, vector constructs in
which the antisense nucleic acid molecule is placed under the control of a strong
prokaryotic, viral, or eukaryotic promoter are preferred.

In yet another embodiment, the antisense nucleic acid molecule of the invention
is an α-anomeric nucleic acid molecule. An α-anomeric nucleic acid molecule forms
specific double-stranded hybrids with complementary RNA in which, contrary to the
usual β-units, the strands run parallel to each other (Gaultier et al. (1987) Nucleic Acids
Res. 15:6625-6641). The antisense nucleic acid molecule can also comprise a
2'-O-methylribonucleotide (Inoue et al. (1987) Nucleic Acids Res. 15:6131-6148) or a
chimeric RNA-DNA analogue (Inoue et al. (1987) FEBS Lett. 215:327-330).

In still another embodiment, an antisense nucleic acid of the invention is a
ribozyme. Ribozymes are catalytic RNA molecules with ribonuclease activity which are
capable of cleaving a single-stranded nucleic acid, such as an mRNA, to which they
have a complementary region. Thus, ribozymes (e.g., hammerhead ribozymes
(described in Haseloff and Gerlach (1988) Nature 334:585-591)) can be used to
catalytically cleave MR mRNA transcripts to thereby inhibit translation of MR mRNA.
A ribozyme having specificity for an MR-encoding nucleic acid can be designed based
upon the nucleotide sequence of an MR DNA disclosed herein (i.e., SEQ ID NO:1
(RXN03181 in Appendix A)). For example, a derivative of a Tetrahymena L-19 IVS
RNA can be constructed in which the nucleotide sequence of the active site is
complementary to the nucleotide sequence to be cleaved in an MR-encoding mRNA.
See, e.g., Cech et al. U.S. Patent No. 4,987,071 and Cech et al. U.S. Patent No.
5,116,742. Alternatively, MR mRNA can be used to select a catalytic RNA having a
specific ribonuclease activity from a pool of RNA molecules. See, e.g., Bartel, D. and
Szostak, J.W. (1993) Science 261:1411-1418.

Alternatively, MR gene expression can be inhibited by targeting nucleotide
sequences complementary to the regulatory region of an MR nucleotide sequence (e.g.,
an MR promoter and/or enhancers) to form triple helical structures that prevent
transcription of an MR gene in target cells. See generally, Helene, C. (1991) Anticancer
Drug Des. 6(6):569-84; Helene, C. et al. (1992) Ann. N.Y. Acad. Sci. 660:27-36; and
Maher, L.J. (1992) BioEssays 14(12):807-15.

B. Recombinant Expression Vectors and Host Cells

Another aspect of the invention pertains to vectors, preferably expression
vectors, containing a nucleic acid encoding an MR protein (or a portion thereof). As
used herein, the term "vector" refers to a nucleic acid molecule capable of transporting
another nucleic acid to which it has been linked. One type of vector is a "plasmid",
which refers to a circular double stranded DNA loop into which additional DNA
segments can be ligated. Another type of vector is a viral vector, wherein additional
DNA segments can be ligated into the viral genome. Certain vectors are capable of
autonomous replication in a host cell into which they are introduced (e.g., bacterial
vectors having a bacterial origin of replication and episomal mammalian vectors). Other
vectors (e.g., non-episomal mammalian vectors) are integrated into the genome of a host
cell upon introduction into the host cell, and thereby are replicated along with the host
genome. Moreover, certain vectors are capable of directing the expression of genes to
which they are operatively linked. Such vectors are referred to herein as "expression
vectors". In general, expression vectors of utility in recombinant DNA techniques are
often in the form of plasmids. In the present specification, "plasmid" and "vector" can
be used interchangeably as the plasmid is the most commonly used form of vector.
However, the invention is intended to include such other forms of expression vectors,
such as viral vectors (e.g., replication defective retroviruses, adenoviruses and
adeno-associated viruses), which serve equivalent functions.

The recombinant expression vectors of the invention comprise a nucleic acid of
the invention in a form suitable for expression of the nucleic acid in a host cell, which
means that the recombinant expression vectors include one or more regulatory
sequences, selected on the basis of the host cells to be used for expression, which are
operatively linked to the nucleic acid sequence to be expressed. Within a recombinant
expression vector, "operably linked" is intended to mean that the nucleotide sequence of
interest is linked to the regulatory sequence(s) in a manner which allows for expression
of the nucleotide sequence (e.g., in an in vitro transcription/translation system or in a
host cell when the vector is introduced into the host cell). The term "regulatory
sequence" is intended to include promoters, enhancers and other expression control
elements (e.g., polyadenylation signals). Such regulatory sequences are described, for
example, in Goeddel, Gene Expression Technology: Methods in Enzymology 185,
Academic Press, San Diego, CA (1990). Regulatory sequences include those which
direct constitutive expression of a nucleotide sequence in many types of host cell and
those which direct expression of the nucleotide sequence only in certain host cells.
Preferred regulatory sequences are, for example, promoters such as cos-, tac-, trp-, tet-,
trp-tet-, lpp-, lac-, lpp-lac-, lacIq-, T7-, T5-, T3-, gal-, trc-, ara-, SP6-, amy, SPO2, λ-PR-
or λ-PL, which are used preferably in bacteria. Additional regulatory sequences are, for
example, promoters from yeasts and fungi, such as ADC1, MFα, AC, P-60, CYC1,
GAPDH, TEF, rp28, ADH, promoters from plants such as CaMV/35S, SSU, OCS, lib4,
usp, STLS1, B33, nos or ubiquitin- or phaseolin-promoters. It is also possible to use
artificial promoters. It will be appreciated by one of ordinary skill in the art that the
design of the expression vector can depend on such factors as the choice of the host cell
to be transformed, the level of expression of protein desired, etc. The expression vectors
of the invention can be introduced into host cells to thereby produce proteins or
peptides, including fusion proteins or peptides, encoded by nucleic acids as described
herein (e.g., MR proteins, mutant forms of MR proteins, fusion proteins, etc.).

The recombinant expression vectors of the invention can be designed for
expression of MR proteins in prokaryotic or eukaryotic cells. For example, MR genes
can be expressed in bacterial cells such as C. glutamicum, insect cells (using baculovirus
expression vectors), yeast and other fungal cells (see Romanos, M.A. et al. (1992)
"Foreign gene expression in yeast: a review", Yeast 8: 423-488; van den Hondel,
C.A.M.J.J. et al. (1991) "Heterologous gene expression in filamentous fungi" in: More
Gene Manipulations in Fungi, J.W. Bennett & L.L. Lasure, eds., p. 396-428, Academic
Press: San Diego; and van den Hondel, C.A.M.J.J. & Punt, P.J. (1991) "Gene transfer
systems and vector development for filamentous fungi" in: Applied Molecular Genetics
of Fungi, Peberdy, J.F. et al., eds., p. 1-28, Cambridge University Press: Cambridge),
algae and multicellular plant cells (see Schmidt, R. and Willmitzer, L. (1988) "High
efficiency Agrobacterium tumefaciens-mediated transformation of Arabidopsis
thaliana leaf and cotyledon explants" Plant Cell Rep.: 583-586), or mammalian cells.
Suitable host cells are discussed further in Goeddel, Gene Expression Technology:
Methods in Enzymology 185, Academic Press, San Diego, CA (1990). Alternatively, the
recombinant expression vector can be transcribed and translated in vitro, for example
using T7 promoter regulatory sequences and T7 polymerase.

Expression of proteins in prokaryotes is most often carried out with vectors
containing constitutive or inducible promoters directing the expression of either fusion
or non-fusion proteins. Fusion vectors add a number of amino acids to a protein
encoded therein, usually to the amino terminus of the recombinant protein. Such fusion
vectors typically serve three purposes: 1) to increase expression of recombinant protein;
2) to increase the solubility of the recombinant protein; and 3) to aid in the purification
of the recombinant protein by acting as a ligand in affinity purification. Often, in fusion
expression vectors, a proteolytic cleavage site is introduced at the junction of the fusion
moiety and the recombinant protein to enable separation of the recombinant protein
from the fusion moiety subsequent to purification of the fusion protein. Enzymes that
cleave at such sites, together with their cognate recognition sequences, include Factor Xa,
thrombin and enterokinase.

Typical fusion expression vectors include pGEX (Pharmacia Biotech Inc; Smith,
D.B. and Johnson, K.S. (1988) Gene 67:31-40), pMAL (New England Biolabs, Beverly,
MA) and pRIT5 (Pharmacia, Piscataway, NJ) which fuse glutathione S-transferase
(GST), maltose E binding protein, or protein A, respectively, to the target recombinant
protein. In one embodiment, the coding sequence of the MR protein is cloned into a
pGEX expression vector to create a vector encoding a fusion protein comprising, from
the N-terminus to the C-terminus, GST-thrombin cleavage site-X protein. The fusion
protein can be purified by affinity chromatography using glutathione-agarose resin.
Recombinant MR protein unfused to GST can be recovered by cleavage of the fusion
protein with thrombin.

Examples of suitable inducible non-fusion E. coli expression vectors include
pTrc (Amann et al. (1988) Gene 69:301-315), pLG338, pACYC184, pBR322, pUC18,
pUC19, pKC30, pRep4, pHS1, pHS2, pPLc236, pMBL24, pLG200, pUR290,
pIN-III113-B1, λgt11, pBdCI, and pET 11d (Studier et al., Gene Expression Technology:
Methods in Enzymology 185, Academic Press, San Diego, California (1990) 60-89; and
Pouwels et al., eds. (1985) Cloning Vectors. Elsevier: New York, ISBN 0 444 904018).
Target gene expression from the pTrc vector relies on host RNA polymerase
transcription from a hybrid trp-lac fusion promoter. Target gene expression from the
pET 11d vector relies on transcription from a T7 gn10-lac fusion promoter mediated by
a coexpressed viral RNA polymerase (T7 gn1). This viral polymerase is supplied by
host strains BL21(DE3) or HMS174(DE3) from a resident λ prophage harboring a T7
gn1 gene under the transcriptional control of the lacUV5 promoter. For transformation
of other varieties of bacteria, appropriate vectors may be selected. For example, the
plasmids pIJ101, pIJ364, pIJ702 and pIJ361 are known to be useful in transforming
Streptomyces, while plasmids pUB110, pC194, or pBD214 are suited for transformation
of Bacillus species. Several plasmids of use in the transfer of genetic information into
Corynebacterium include pHM1519, pBL1, pSA77, or pAJ667 (Pouwels et al., eds.
(1985) Cloning Vectors. Elsevier: New York, ISBN 0 444 904018).

One strategy to maximize recombinant protein expression is to express the protein in
a host bacterium with an impaired capacity to proteolytically cleave the recombinant
protein (Gottesman, S., Gene Expression Technology: Methods in Enzymology 185,
Academic Press, San Diego, California (1990) 119-128). Another strategy is to alter the
sequence of the nucleic acid to be inserted into an expression vector so that the
individual codons for each amino acid are those preferentially utilized in the bacterium
chosen for expression, such as C. glutamicum (Wada et al. (1992) Nucleic Acids Res.
20:2111-2118). Such alteration of nucleic acid sequences of the invention can be
carried out by standard DNA synthesis techniques.
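
The codon-replacement strategy described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the preferred-codon table below is hypothetical (it is not the C. glutamicum usage reported by Wada et al.), the codon table covers only the amino acids in the example, and the input sequence is invented.

```python
# Standard genetic code, restricted to the codons used in this example.
CODON_TO_AA = {
    "ATG": "M", "AAA": "K", "AAG": "K", "GAA": "E", "GAG": "E",
    "CTG": "L", "TTA": "L", "TAA": "*",
}

# HYPOTHETICAL preferred codons; real values would be taken from a
# codon usage table for the bacterium chosen for expression.
PREFERRED_CODON = {"M": "ATG", "K": "AAG", "E": "GAG", "L": "CTG", "*": "TAA"}


def translate(dna: str) -> str:
    """Translate a coding sequence codon by codon ('*' marks a stop)."""
    if len(dna) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    return "".join(CODON_TO_AA[dna[i:i + 3]] for i in range(0, len(dna), 3))


def recode(dna: str) -> str:
    """Replace every codon by the preferred codon for the same amino acid."""
    return "".join(PREFERRED_CODON[aa] for aa in translate(dna))


original = "ATGAAAGAATTATAA"  # invented example encoding M-K-E-L-stop
recoded = recode(original)
print(recoded)  # ATGAAGGAGCTGTAA
```

The recoded sequence differs at the nucleotide level but encodes the same amino acid sequence, which is the property the strategy relies on.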

In another embodiment, the MR protein expression vector is a yeast expression
vector. Examples of vectors for expression in yeast S. cerevisiae include pYepSec1
(Baldari et al. (1987) EMBO J. 6:229-234), 2µ, pAG-1, Yep6, Yep13, pEMBLYe23,
pMFa (Kurjan and Herskowitz (1982) Cell 30:933-943), pJRY88 (Schultz et al. (1987)
Gene 54:113-123), and pYES2 (Invitrogen Corporation, San Diego, CA). Vectors and
methods for the construction of vectors appropriate for use in other fungi, such as the
filamentous fungi, include those detailed in: van den Hondel, C.A.M.J.J. & Punt, P.J.
(1991) "Gene transfer systems and vector development for filamentous fungi", in:
Applied Molecular Genetics of Fungi, J.F. Peberdy et al., eds., p. 1-28, Cambridge
University Press: Cambridge, and Pouwels et al., eds. (1985) Cloning Vectors. Elsevier:
New York (ISBN 0 444 904018).

Alternatively, the MR proteins of the invention can be expressed in insect cells
using baculovirus expression vectors. Baculovirus vectors available for expression of
proteins in cultured insect cells (e.g., Sf9 cells) include the pAc series (Smith et al.
(1983) Mol. Cell Biol. 3:2156-2165) and the pVL series (Luckow and Summers (1989)
Virology 170:31-39).

In another embodiment, the MR proteins of the invention may be expressed in
unicellular plant cells (such as algae) or in plant cells from higher plants (e.g., the
spermatophytes, such as crop plants). Examples of plant expression vectors include
those detailed in: Becker, D., Kemper, E., Schell, J. and Masterson, R. (1992) "New
plant binary vectors with selectable markers located proximal to the left border", Plant
Mol. Biol. 20: 1195-1197; and Bevan, M.W. (1984) "Binary Agrobacterium vectors for
plant transformation", Nucl. Acid. Res. 12: 8711-8721, and include pLGV23, pGHlac+,
pBIN19, pAK2004, and pDH51 (Pouwels et al., eds. (1985) Cloning Vectors. Elsevier:
New York, ISBN 0 444 904018).

In yet another embodiment, a nucleic acid of the invention is expressed in
mammalian cells using a mammalian expression vector. Examples of mammalian
expression vectors include pCDM8 (Seed, B. (1987) Nature 329:840) and pMT2PC
(Kaufman et al. (1987) EMBO J. 6:187-195). When used in mammalian cells, the
expression vector's control functions are often provided by viral regulatory elements.
For example, commonly used promoters are derived from polyoma, Adenovirus 2,
cytomegalovirus and Simian Virus 40. For other suitable expression systems for both
prokaryotic and eukaryotic cells see chapters 16 and 17 of Sambrook, J., Fritsch, E. F.,
and Maniatis, T. Molecular Cloning: A Laboratory Manual, 2nd ed., Cold Spring
Harbor Laboratory, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, NY,
1989.

In another embodiment, the recombinant mammalian expression vector is
capable of directing expression of the nucleic acid preferentially in a particular cell type
(e.g., tissue-specific regulatory elements are used to express the nucleic acid). Tissue-
specific regulatory elements are known in the art. Non-limiting examples of suitable
tissue-specific promoters include the albumin promoter (liver-specific; Pinkert et al.
(1987) Genes Dev. 1:268-277), lymphoid-specific promoters (Calame and Eaton (1988)
Adv. Immunol. 43:235-275), in particular promoters of T cell receptors (Winoto and
Baltimore (1989) EMBO J. 8:729-733) and immunoglobulins (Banerji et al. (1983) Cell
33:729-740; Queen and Baltimore (1983) Cell 33:741-748), neuron-specific promoters
(e.g., the neurofilament promoter; Byrne and Ruddle (1989) PNAS 86:5473-5477),
pancreas-specific promoters (Edlund et al. (1985) Science 230:912-916), and mammary
gland-specific promoters (e.g., milk whey promoter; U.S. Patent No. 4,873,316 and
European Application Publication No. 264,166). Developmentally-regulated promoters
are also encompassed, for example the murine hox promoters (Kessel and Gruss (1990)
Science 249:374-379) and the α-fetoprotein promoter (Camper and Tilghman (1989)
Genes Dev. 3:537-546).

The invention further provides a recombinant expression vector comprising a
DNA molecule of the invention cloned into the expression vector in an antisense
orientation. That is, the DNA molecule is operatively linked to a regulatory sequence in
a manner which allows for expression (by transcription of the DNA molecule) of an
RNA molecule which is antisense to MR mRNA. Regulatory sequences operatively
linked to a nucleic acid cloned in the antisense orientation can be chosen which direct
the continuous expression of the antisense RNA molecule in a variety of cell types, for
instance viral promoters and/or enhancers, or regulatory sequences can be chosen which
direct constitutive, tissue specific or cell type specific expression of antisense RNA.
The antisense expression vector can be in the form of a recombinant plasmid, phagemid
or attenuated virus in which antisense nucleic acids are produced under the control of a
high efficiency regulatory region, the activity of which can be determined by the cell
type into which the vector is introduced. For a discussion of the regulation of gene
expression using antisense genes see Weintraub, H. et al., Antisense RNA as a
molecular tool for genetic analysis, Reviews - Trends in Genetics, Vol. 1(1) 1986.
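
At the sequence level, an RNA that is antisense to an mRNA is simply its reverse complement. The following Python sketch illustrates that relationship only; the function name and the example sequence are hypothetical and are not taken from the Appendices.

```python
# Watson-Crick pairing for RNA bases.
RNA_COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def antisense_rna(mrna: str) -> str:
    """Return the antisense RNA (reverse complement) of an mRNA.

    Both the input and the result are written in the 5'->3' direction.
    """
    return "".join(RNA_COMPLEMENT[base] for base in reversed(mrna.upper()))


# Invented mRNA fragment, for illustration only.
example_mrna = "AUGGCUUAA"
print(antisense_rna(example_mrna))  # UUAAGCCAU
```

Applying the function twice returns the original sequence, which is a quick sanity check that the transcript produced from the antisense orientation can base-pair with the sense mRNA along its whole length.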

Another aspect of the invention pertains to host cells into which a recombinant
expression vector of the invention has been introduced. The terms "host cell" and
"recombinant host cell" are used interchangeably herein. It is understood that such
terms refer not only to the particular subject cell but to the progeny or potential progeny
of such a cell. Because certain modifications may occur in succeeding generations due
to either mutation or environmental influences, such progeny may not, in fact, be
identical to the parent cell, but are still included within the scope of the term as used
herein.

A host cell can be any prokaryotic or eukaryotic cell. For example, an MR
protein can be expressed in bacterial cells such as C. glutamicum, insect cells, yeast or
mammalian cells (such as Chinese hamster ovary cells (CHO) or COS cells). Other
suitable host cells are known to one of ordinary skill in the art. Microorganisms related
to Corynebacterium glutamicum which may be conveniently used as host cells for the
nucleic acid and protein molecules of the invention are set forth in Table 3.

Vector DNA can be introduced into prokaryotic or eukaryotic cells via
conventional transformation or transfection techniques. As used herein, the terms
"transformation" and "transfection" are intended to refer to a variety of art-recognized
techniques for introducing foreign nucleic acid (e.g., linear DNA or RNA (e.g., a
linearized vector or a gene construct alone without a vector) or nucleic acid in the form
of a vector (e.g., a plasmid, phage, phasmid, phagemid, transposon or other DNA)) into a
host cell, including calcium phosphate or calcium chloride co-precipitation, DEAE-
dextran-mediated transfection, lipofection, or electroporation. Suitable methods for
transforming or transfecting host cells can be found in Sambrook et al. (Molecular
Cloning: A Laboratory Manual, 2nd ed., Cold Spring Harbor Laboratory, Cold Spring
Harbor Laboratory Press, Cold Spring Harbor, NY, 1989), and other laboratory manuals.

For stable transfection of mammalian cells, it is known that, depending upon the
expression vector and transfection technique used, only a small fraction of cells may
integrate the foreign DNA into their genome. In order to identify and select these
integrants, a gene that encodes a selectable marker (e.g., resistance to antibiotics) is
generally introduced into the host cells along with the gene of interest. Preferred
selectable markers include those which confer resistance to drugs, such as G418,
hygromycin and methotrexate. Nucleic acid encoding a selectable marker can be
introduced into a host cell on the same vector as that encoding an MR protein or can be
introduced on a separate vector. Cells stably transfected with the introduced nucleic
acid can be identified by, for example, drug selection (e.g., cells that have incorporated
the selectable marker gene will survive, while the other cells die).

To create a homologous recombinant microorganism, a vector is prepared which
contains at least a portion of an MR gene into which a deletion, addition or substitution
has been introduced to thereby alter, e.g., functionally disrupt, the MR gene. Preferably,
this MR gene is a Corynebacterium glutamicum MR gene, but it can be a homologue
from a related bacterium or even from a mammalian, yeast, or insect source. In a
preferred embodiment, the vector is designed such that, upon homologous
recombination, the endogenous MR gene is functionally disrupted (i.e., no longer
encodes a functional protein; also referred to as a "knock out" vector). Alternatively,
the vector can be designed such that, upon homologous recombination, the endogenous
MR gene is mutated or otherwise altered but still encodes functional protein (e.g., the
upstream regulatory region can be altered to thereby alter the expression of the
endogenous MR protein). In the homologous recombination vector, the altered portion
of the MR gene is flanked at its 5' and 3' ends by additional nucleic acid of the MR gene
to allow for homologous recombination to occur between the exogenous MR gene
carried by the vector and an endogenous MR gene in a microorganism. The additional
flanking MR nucleic acid is of sufficient length for successful homologous
recombination with the endogenous gene. Typically, several kilobases of flanking DNA
(both at the 5' and 3' ends) are included in the vector (see e.g., Thomas, K.R., and
Capecchi, M.R. (1987) Cell 51: 503 for a description of homologous recombination
vectors). The vector is introduced into a microorganism (e.g., by electroporation) and
cells in which the introduced MR gene has homologously recombined with the
endogenous MR gene are selected, using art-known techniques.

In another embodiment, recombinant microorganisms can be produced which
contain selected systems which allow for regulated expression of the introduced gene.
For example, inclusion of an MR gene on a vector placing it under control of the lac
operon permits expression of the MR gene only in the presence of IPTG. Such
regulatory systems are well known in the art.

In another embodiment, an endogenous MR gene in a host cell is disrupted (e.g.,
by homologous recombination or other genetic means known in the art) such that
expression of its protein product does not occur. In another embodiment, an endogenous
or introduced MR gene in a host cell has been altered by one or more point mutations,
deletions, or inversions, but still encodes a functional MR protein. In still another
embodiment, one or more of the regulatory regions (e.g., a promoter, repressor, or
inducer) of an MR gene in a microorganism has been altered (e.g., by deletion,
truncation, inversion, or point mutation) such that the expression of the MR gene is
modulated. One of ordinary skill in the art will appreciate that host cells containing
more than one of the described MR gene and protein modifications may be readily
produced using the methods of the invention, and are meant to be included in the present
invention.

A host cell of the invention, such as a prokaryotic or eukaryotic host cell in
culture, can be used to produce (i.e., express) an MR protein. Accordingly, the
invention further provides methods for producing MR proteins using the host cells of the
invention. In one embodiment, the method comprises culturing the host cell of the
invention (into which a recombinant expression vector encoding an MR protein has been
introduced, or into whose genome a gene encoding a wild-type or altered MR protein
has been introduced) in a suitable medium until MR protein is produced. In another
embodiment, the method further comprises isolating MR proteins from the medium or
the host cell.

C. Isolated MR Proteins

Another aspect of the invention pertains to isolated MR proteins, and
biologically active portions thereof. An "isolated" or "purified" protein or biologically
active portion thereof is substantially free of cellular material when produced by
recombinant DNA techniques, or chemical precursors or other chemicals when
chemically synthesized. The language "substantially free of cellular material" includes
preparations of MR protein in which the protein is separated from cellular components
of the cells in which it is naturally or recombinantly produced. In one embodiment, the
language "substantially free of cellular material" includes preparations of MR protein
having less than about 30% (by dry weight) of non-MR protein (also referred to herein
as a "contaminating protein"), more preferably less than about 20% of non-MR protein,
still more preferably less than about 10% of non-MR protein, and most preferably less
than about 5% non-MR protein. When the MR protein or biologically active portion
thereof is recombinantly produced, it is also preferably substantially free of culture
medium, i.e., culture medium represents less than about 20%, more preferably less than
about 10%, and most preferably less than about 5% of the volume of the protein
preparation. The language "substantially free of chemical precursors or other
chemicals" includes preparations of MR protein in which the protein is separated from
chemical precursors or other chemicals which are involved in the synthesis of the
protein. In one embodiment, the language "substantially free of chemical precursors or
other chemicals" includes preparations of MR protein having less than about 30% (by
dry weight) of chemical precursors or non-MR chemicals, more preferably less than
about 20% chemical precursors or non-MR chemicals, still more preferably less than
about 10% chemical precursors or non-MR chemicals, and most preferably less than
about 5% chemical precursors or non-MR chemicals. In preferred embodiments,
isolated proteins or biologically active portions thereof lack contaminating proteins from
the same organism from which the MR protein is derived. Typically, such proteins are
produced by recombinant expression of, for example, a C. glutamicum MR protein in a
microorganism such as C. glutamicum.
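
The graded dry-weight limits for contaminating protein recited above (30%, 20%, 10%, 5%) can be read as a simple lookup. The Python sketch below is illustrative only: the function name and tier labels are hypothetical, and the hard cut-offs ignore the word "about" used in the text.

```python
# (upper limit in % dry weight of non-MR protein, label), strictest first.
TIERS = [
    (5.0, "most preferred (< 5%)"),
    (10.0, "still more preferred (< 10%)"),
    (20.0, "more preferred (< 20%)"),
    (30.0, "substantially free (< 30%)"),
]


def purity_tier(contaminant_percent: float) -> str:
    """Map a contaminant fraction (% by dry weight) to the strictest tier it meets."""
    if not 0.0 <= contaminant_percent <= 100.0:
        raise ValueError("percentage must lie between 0 and 100")
    for limit, label in TIERS:
        if contaminant_percent < limit:
            return label
    return "not substantially free"


for value in (3.0, 12.5, 45.0):
    print(value, "->", purity_tier(value))
```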

An isolated MR protein or a portion thereof of the invention can
transcriptionally, translationally, or posttranslationally regulate a metabolic pathway in
C. glutamicum, or has one or more of the activities set forth in Table 1. In preferred
embodiments, the protein or portion thereof comprises an amino acid sequence which is
sufficiently homologous to an amino acid sequence of Appendix B such that the protein
or portion thereof maintains the ability to transcriptionally, translationally, or
posttranslationally regulate a metabolic pathway in C. glutamicum. The portion of the
protein is preferably a biologically active portion as described herein. In another
preferred embodiment, an MR protein of the invention has an amino acid sequence
shown in Appendix B. In yet another preferred embodiment, the MR protein has an
amino acid sequence which is encoded by a nucleotide sequence which hybridizes, e.g.,
hybridizes under stringent conditions, to a nucleotide sequence of Appendix A. In still
another preferred embodiment, the MR protein has an amino acid sequence which is
encoded by a nucleotide sequence that is at least about 50%, 51%, 52%, 53%, 54%,
55%, 56%, 57%, 58%, 59%, or 60%, preferably at least about 61%, 62%, 63%, 64%,
65%, 66%, 67%, 68%, 69%, or 70%, more preferably at least about 71%, 72%, 73%,
74%, 75%, 76%, 77%, 78%, 79%, or 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%,
88%, 89%, or 90%, or 91%, 92%, 93%, 94%, and even more preferably at least about
95%, 96%, 97%, 98%, 99% or more homologous to one of the nucleic acid sequences of
Appendix A, or a portion thereof. Ranges and identity values intermediate to the above-
recited values (e.g., 70-90% identical or 80-95% identical) are also intended to be
encompassed by the present invention. For example, ranges of identity values using a
combination of any of the above values recited as upper and/or lower limits are intended
to be included. The preferred MR proteins of the present invention also preferably
possess at least one of the MR activities described herein. For example, a preferred MR
protein of the present invention includes an amino acid sequence encoded by a
nucleotide sequence which hybridizes, e.g., hybridizes under stringent conditions, to a
nucleotide sequence of Appendix A, and which can transcriptionally, translationally, or
posttranslationally regulate a metabolic pathway in C. glutamicum, or which has one or
more of the activities set forth in Table 1.

In other embodiments, the MR protein is substantially homologous to an amino
acid sequence of Appendix B and retains the functional activity of the protein of one of
the sequences of Appendix B yet differs in amino acid sequence due to natural variation
or mutagenesis, as described in detail in subsection I above. Accordingly, in another
embodiment, the MR protein is a protein which comprises an amino acid sequence
which is at least about 50%, 51%, 52%, 53%, 54%, 55%, 56%, 57%, 58%, 59%, or
60%, preferably at least about 61%, 62%, 63%, 64%, 65%, 66%, 67%, 68%, 69%, or
70%, more preferably at least about 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%,
or 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, or 90%, or 91%, 92%,
93%, 94%, and even more preferably at least about 95%, 96%, 97%, 98%, 99% or more
homologous to an entire amino acid sequence of Appendix B and which has at least one
of the MR activities described herein. Ranges and identity values intermediate to the
above-recited values (e.g., 70-90% identical or 80-95% identical) are also intended to
be encompassed by the present invention. For example, ranges of identity values using
a combination of any of the above values recited as upper and/or lower limits are
intended to be included. In another embodiment, the invention pertains to a full length
C. glutamicum protein which is substantially homologous to an entire amino acid
sequence of Appendix B.
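
For orientation, the percentage values recited above can be compared against a computed percent identity. The Python sketch below assumes that the two sequences have already been aligned (gaps written as "-") and uses one common convention, identical positions divided by aligned positions; this convention, the function names, and the example sequences are assumptions made for illustration and are not taken from the Appendices.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over the columns of a pairwise alignment.

    Columns in which both sequences carry a gap are ignored; a residue
    opposite a gap counts as a mismatch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap and b == gap:
            continue
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * matches / compared


def meets_threshold(identity: float, threshold: float = 50.0) -> bool:
    """True if the identity is at least the chosen lower limit (e.g. 50%)."""
    return identity >= threshold


identity = percent_identity("MKTAYIAK", "MKTAHIAK")  # invented peptides
print(identity, meets_threshold(identity, 80.0))  # 87.5 True
```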

Biologically active portions of an MR protein include peptides comprising amino
acid sequences derived from the amino acid sequence of an MR protein, e.g., an
amino acid sequence shown in Appendix B or the amino acid sequence of a protein
homologous to an MR protein, which include fewer amino acids than a full length MR
protein or the full length protein which is homologous to an MR protein, and exhibit at
least one activity of an MR protein. Typically, biologically active portions (peptides,
e.g., peptides which are, for example, 5, 10, 15, 20, 30, 35, 36, 37, 38, 39, 40, 50, 100 or
more amino acids in length) comprise a domain or motif with at least one activity of an
MR protein. Moreover, other biologically active portions, in which other regions of the
protein are deleted, can be prepared by recombinant techniques and evaluated for one or
more of the activities described herein. Preferably, the biologically active portions of an
MR protein include one or more selected domains/motifs or portions thereof having
biological activity.

MR proteins are preferably produced by recombinant DNA techniques. For
example, a nucleic acid molecule encoding the protein is cloned into an expression
vector (as described above), the expression vector is introduced into a host cell (as
described above) and the MR protein is expressed in the host cell. The MR protein can
then be isolated from the cells by an appropriate purification scheme using standard
protein purification techniques. As an alternative to recombinant expression, an MR protein,
polypeptide, or peptide can be synthesized chemically using standard peptide synthesis
techniques. Moreover, native MR protein can be isolated from cells (e.g., endothelial
cells), for example using an anti-MR antibody, which can be produced by standard
techniques utilizing an MR protein or fragment thereof of this invention.

The invention also provides MR chimeric or fusion proteins. As used herein, an
MR "chimeric protein" or "fusion protein" comprises an MR polypeptide operatively
linked to a non-MR polypeptide. An "MR polypeptide" refers to a polypeptide having
an amino acid sequence corresponding to an MR protein, whereas a "non-MR
polypeptide" refers to a polypeptide having an amino acid sequence corresponding to a
protein which is not substantially homologous to the MR protein, e.g., a protein which is
different from the MR protein and which is derived from the same or a different
organism. Within the fusion protein, the term "operatively linked" is intended to
indicate that the MR polypeptide and the non-MR polypeptide are fused in-frame to
each other. The non-MR polypeptide can be fused to the N-terminus or C-terminus of
the MR polypeptide. For example, in one embodiment the fusion protein is a GST-MR
fusion protein in which the MR sequences are fused to the C-terminus of the GST
sequences. Such fusion proteins can facilitate the purification of recombinant MR
proteins. In another embodiment, the fusion protein is an MR protein containing a
heterologous signal sequence at its N-terminus. In certain host cells (e.g., mammalian
host cells), expression and/or secretion of an MR protein can be increased through use of
a heterologous signal sequence.

Preferably, an MR chimeric or fusion protein of the invention is produced by
standard recombinant DNA techniques. For example, DNA fragments coding for the
different polypeptide sequences are ligated together in-frame in accordance with
conventional techniques, for example by employing blunt-ended or stagger-ended
termini for ligation, restriction enzyme digestion to provide for appropriate termini,
filling-in of cohesive ends as appropriate, alkaline phosphatase treatment to avoid
undesirable joining, and enzymatic ligation. In another embodiment, the fusion gene
can be synthesized by conventional techniques including automated DNA synthesizers.
Alternatively, PCR amplification of gene fragments can be carried out using anchor
primers which give rise to complementary overhangs between two consecutive gene
fragments which can subsequently be annealed and reamplified to generate a chimeric
gene sequence (see, for example, Current Protocols in Molecular Biology, eds. Ausubel
et al., John Wiley & Sons: 1992). Moreover, many expression vectors are commercially
available that already encode a fusion moiety (e.g., a GST polypeptide). An MR-
encoding nucleic acid can be cloned into such an expression vector such that the fusion
moiety is linked in-frame to the MR protein.
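
The in-frame requirement can be checked computationally before a construct is built. The following Python sketch joins an N-terminal fusion-moiety coding sequence to a downstream coding sequence and rejects inputs that would shift the reading frame or terminate early. The function names are hypothetical and the short sequences are invented placeholders, not real GST or MR sequences.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def split_codons(dna: str) -> list:
    """Split a coding sequence into consecutive triplets."""
    return [dna[i:i + 3] for i in range(0, len(dna), 3)]


def fuse_in_frame(n_terminal_orf: str, c_terminal_orf: str) -> str:
    """Join two ORFs so that the second is read in the frame of the first.

    A terminal stop codon on the N-terminal ORF is removed; any other
    stop codon in it, or a length that is not a multiple of three,
    raises ValueError.
    """
    n_orf, c_orf = n_terminal_orf.upper(), c_terminal_orf.upper()
    if not n_orf or len(n_orf) % 3 or len(c_orf) % 3:
        raise ValueError("both ORFs must be non-empty multiples of 3 nt")
    n_codons = split_codons(n_orf)
    if n_codons[-1] in STOP_CODONS:
        n_codons = n_codons[:-1]  # drop the stop so translation continues
    if any(codon in STOP_CODONS for codon in n_codons):
        raise ValueError("internal stop codon in the N-terminal moiety")
    return "".join(n_codons) + c_orf


tag = "ATGTCCTAA"        # placeholder fusion moiety: M-S-stop
target = "ATGAAAGAATAA"  # placeholder target: M-K-E-stop
print(fuse_in_frame(tag, target))  # ATGTCCATGAAAGAATAA
```
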

Homologues of the MR protein can be generated by mutagenesis, e.g., discrete
point mutation or truncation of the MR protein. As used herein, the term "homologue"
refers to a variant form of the MR protein which acts as an agonist or antagonist of the
activity of the MR protein. An agonist of the MR protein can retain substantially the
same, or a subset, of the biological activities of the MR protein. An antagonist of the
MR protein can inhibit one or more of the activities of the naturally occurring form of
the MR protein, by, for example, competitively binding to a downstream or upstream
member of the MR regulatory cascade which includes the MR protein. Thus, the C.
glutamicum MR protein and homologues thereof of the present invention may modulate
the activity of one or more metabolic pathways which MR proteins regulate in this
microorganism.

In an alternative embodiment, homologues of the MR protein can be identified
by screening combinatorial libraries of mutants, e.g., truncation mutants, of the MR
protein for MR protein agonist or antagonist activity. In one embodiment, a variegated
library of MR variants is generated by combinatorial mutagenesis at the nucleic acid
level and is encoded by a variegated gene library. A variegated library of MR variants
can be produced by, for example, enzymatically ligating a mixture of synthetic
oligonucleotides into gene sequences such that a degenerate set of potential MR
sequences is expressible as individual polypeptides, or alternatively, as a set of larger
fusion proteins (e.g., for phage display) containing the set of MR sequences therein.
There are a variety of methods which can be used to produce libraries of potential MR
homologues from a degenerate oligonucleotide sequence. Chemical synthesis of a
degenerate gene sequence can be performed in an automatic DNA synthesizer, and the
synthetic gene then ligated into an appropriate expression vector. Use of a degenerate
set of genes allows for the provision, in one mixture, of all of the sequences encoding
the desired set of potential MR sequences. Methods for synthesizing degenerate
oligonucleotides are known in the art (see, e.g., Narang, S.A. (1983) Tetrahedron 39:3;
Itakura et al. (1984) Annu. Rev. Biochem. 53:323; Itakura et al. (1984) Science
198:1056; Ike et al. (1983) Nucleic Acid Res. 11:477).
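
To make concrete what "a degenerate set" means, the Python sketch below expands a degenerate oligonucleotide written with the standard IUPAC ambiguity codes into every concrete sequence it represents. The function name and the example oligonucleotides are hypothetical.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def expand_degenerate(oligo: str) -> list:
    """List every concrete sequence encoded by a degenerate oligonucleotide."""
    choices = [IUPAC[base] for base in oligo.upper()]
    return ["".join(combo) for combo in product(*choices)]


print(expand_degenerate("GAR"))       # ['GAA', 'GAG']
print(len(expand_degenerate("NNK")))  # 32
```

The library size is the product of the number of alternatives at each position, which is why a single degenerate synthesis can supply, in one mixture, every sequence in the desired set.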

In addition, libraries of fragments of the MR protein coding sequence can be used to
generate a variegated population of MR fragments for screening and subsequent
selection of homologues of an MR protein. In one embodiment, a library of coding
sequence fragments can be generated by treating a double stranded PCR fragment of an
MR coding sequence with a nuclease under conditions wherein nicking occurs only
about once per molecule, denaturing the double stranded DNA, renaturing the DNA to
form double stranded DNA which can include sense/antisense pairs from different
nicked products, removing single stranded portions from reformed duplexes by
treatment with S1 nuclease, and ligating the resulting fragment library into an expression
vector. By this method, an expression library can be derived which encodes N-terminal,
C-terminal and internal fragments of various sizes of the MR protein.
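
The three classes of fragment named above can be enumerated in silico. The Python sketch below lists all N-terminal, C-terminal and internal fragments of a sequence above a minimum length; it is an idealized enumeration, whereas the nuclease procedure yields a random sample of such fragments. The function name, the minimum length and the example peptide are assumptions made for illustration.

```python
def fragment_library(sequence: str, min_len: int = 3) -> dict:
    """Group all proper sub-fragments of a sequence by position.

    'n_terminal' fragments start at the first residue, 'c_terminal'
    fragments end at the last residue, and 'internal' fragments touch
    neither end. The full-length sequence itself is excluded.
    """
    n = len(sequence)
    library = {"n_terminal": [], "c_terminal": [], "internal": []}
    for start in range(n):
        for end in range(start + min_len, n + 1):
            if start == 0 and end == n:
                continue  # skip the full-length molecule
            fragment = sequence[start:end]
            if start == 0:
                library["n_terminal"].append(fragment)
            elif end == n:
                library["c_terminal"].append(fragment)
            else:
                library["internal"].append(fragment)
    return library


lib = fragment_library("MKTAY")  # invented five-residue peptide
print(lib["n_terminal"])  # ['MKT', 'MKTA']
print(lib["internal"])    # ['KTA']
print(lib["c_terminal"])  # ['KTAY', 'TAY']
```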

Several techniques are known in the art for screening gene products of
combinatorial libraries made by point mutations or truncation, and for screening cDNA
libraries for gene products having a selected property. Such techniques are adaptable for
rapid screening of the gene libraries generated by the combinatorial mutagenesis of MR
homologues. The most widely used techniques, which are amenable to high through-put
analysis, for screening large gene libraries typically include cloning the gene library into
replicable expression vectors, transforming appropriate cells with the resulting library of
vectors, and expressing the combinatorial genes under conditions in which detection of a
desired activity facilitates isolation of the vector encoding the gene whose product was
detected. Recursive ensemble mutagenesis (REM), a new technique which enhances the
frequency of functional mutants in the libraries, can be used in combination with the
screening assays to identify MR homologues (Arkin and Youvan (1992) PNAS
89:7811-7815; Delagrave et al. (1993) Protein Engineering 6(3):327-331).

In another embodiment, cell based assays can be exploited to analyze a 
variegated MR library, using methods well known in the art. 

D. Uses and Methods of the Invention 
The nucleic acid molecules, proteins, protein homologues, fusion proteins,
primers, vectors, and host cells described herein can be used in one or more of the
following methods: identification of C. glutamicum and related organisms; mapping of
genomes of organisms related to C. glutamicum; identification and localization of C.
glutamicum sequences of interest; evolutionary studies; determination of MR protein
regions required for function; modulation of an MR protein activity; modulation of the
activity of one or more metabolic pathways; and modulation of cellular production of a
desired compound, such as a fine chemical.

The MR nucleic acid molecules of the invention have a variety of uses. First,
they may be used to identify an organism as being Corynebacterium glutamicum or a
close relative thereof. Also, they may be used to identify the presence of C. glutamicum
or a relative thereof in a mixed population of microorganisms. The invention provides
the nucleic acid sequences of a number of C. glutamicum genes; by probing the
extracted genomic DNA of a culture of a unique or mixed population of microorganisms
under stringent conditions with a probe spanning a region of a C. glutamicum gene
which is unique to this organism, one can ascertain whether this organism is present.
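
The logic of probing with an organism-specific sequence can be sketched computationally. The sketch below is a deliberate simplification: it treats stringent hybridization as an exact match on either strand, and the sequences, organism labels and function names are invented for illustration rather than drawn from the Appendices.

```python
def reverse_complement(seq):
    """Reverse complement of a DNA sequence written with A, C, G, T."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(comp[base] for base in reversed(seq.upper()))


def hybridizes(probe, genome):
    """Crude stand-in for stringent hybridization: exact match on either strand."""
    genome = genome.upper()
    probe = probe.upper()
    return probe in genome or reverse_complement(probe) in genome


def organisms_detected(probe, genomes):
    """Names of the genomes (dict of name -> sequence) that the probe matches."""
    return [name for name, seq in genomes.items() if hybridizes(probe, seq)]


# Toy sequences, invented purely for illustration.
genomes = {
    "C. glutamicum": "ATGGCGTACCTGAAGGTCTTGA",
    "other species": "ATGGCGTTTCCGAAAGTCTTGA",
}
probe = "TACCTGAAGG"
hits = organisms_detected(probe, genomes)
```

A probe is informative for identification only if it matches the target organism and no other member of the population, which is what the final list expresses.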

Although Corynebacterium glutamicum itself is nonpathogenic, it is related to
pathogenic species, such as Corynebacterium diphtheriae. Corynebacterium diphtheriae
is the causative agent of diphtheria, a rapidly developing, acute, febrile infection which
involves both local and systemic pathology. In this disease, a local lesion develops in
the upper respiratory tract and involves necrotic injury to epithelial cells; the bacilli
secrete toxin which is disseminated through this lesion to distal susceptible tissues of the
body. Degenerative changes brought about by the inhibition of protein synthesis in
these tissues, which include heart, muscle, peripheral nerves, adrenals, kidneys, liver and
spleen, result in the systemic pathology of the disease. Diphtheria continues to have
high incidence in many parts of the world, including Africa, Asia, Eastern Europe and
the independent states of the former Soviet Union. An ongoing epidemic of diphtheria
in the latter two regions has resulted in at least 5,000 deaths since 1990.

In one embodiment, the invention provides a method of identifying the presence
or activity of Corynebacterium diphtheriae in a subject. This method includes detection
of one or more of the nucleic acid or amino acid sequences of the invention (e.g., the
sequences set forth in Appendix A or Appendix B) in a subject, thereby detecting the
presence or activity of Corynebacterium diphtheriae in the subject. C. glutamicum and
C. diphtheriae are related bacteria, and many of the nucleic acid and protein molecules
in C. glutamicum are homologous to C. diphtheriae nucleic acid and protein molecules,
and can therefore be used to detect C. diphtheriae in a subject.

The nucleic acid and protein molecules of the invention may also serve as
markers for specific regions of the genome. This has utility not only in the mapping of
the genome, but also for functional studies of C. glutamicum proteins. For example, to
identify the region of the genome to which a particular C. glutamicum DNA-binding
protein binds, the C. glutamicum genome could be digested, and the fragments incubated
with the DNA-binding protein. Those which bind the protein may be additionally probed
with the nucleic acid molecules of the invention, preferably with readily detectable
labels; binding of such a nucleic acid molecule to the genome fragment enables the
localization of the fragment to the genome map of C. glutamicum, and, when performed
multiple times with different enzymes, facilitates a rapid determination of the nucleic
acid sequence to which the protein binds. Further, the nucleic acid molecules of the
invention may be sufficiently homologous to the sequences of related species such that
these nucleic acid molecules may serve as markers for the construction of a genomic
map in related bacteria, such as Brevibacterium lactofermentum.

The MR nucleic acid molecules of the invention are also useful for evolutionary
and protein structural studies. The metabolic processes in which the molecules of the
invention participate are utilized by a wide variety of prokaryotic and eukaryotic cells;
by comparing the sequences of the nucleic acid molecules of the present invention to
those encoding similar enzymes from other organisms, the evolutionary relatedness of
the organisms can be assessed. Similarly, such a comparison permits an assessment of
which regions of the sequence are conserved and which are not, which may aid in
determining those regions of the protein which are essential for the functioning of the
enzyme. This type of determination is of value for protein engineering studies and may
give an indication of what the protein can tolerate in terms of mutagenesis without
losing function.
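
A minimal sketch of how conserved regions might be read off a sequence comparison follows. It assumes the sequences have already been aligned to equal length by some external tool; the toy alignment, the scoring rule (fraction of sequences sharing the most common residue) and the function names are illustrative assumptions only.

```python
from collections import Counter


def column_conservation(aligned):
    """Fraction of sequences sharing the most common symbol at each column."""
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("sequences must be pre-aligned to equal length")
    scores = []
    for column in zip(*aligned):
        most_common_count = Counter(column).most_common(1)[0][1]
        scores.append(most_common_count / len(aligned))
    return scores


def conserved_positions(aligned, threshold=1.0):
    """0-based columns that meet the threshold and contain no gap character."""
    columns = list(zip(*aligned))
    scores = column_conservation(aligned)
    return [
        i for i, (col, score) in enumerate(zip(columns, scores))
        if score >= threshold and "-" not in col
    ]


# Invented, pre-aligned toy protein fragments ('-' marks a gap).
alignment = ["MKLVDG", "MKIVDG", "MRLV-G"]
invariant = conserved_positions(alignment)
```

Columns that remain invariant across distant relatives are candidates for residues essential to function, while variable columns suggest positions more likely to tolerate mutagenesis.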

Manipulation of the MR nucleic acid molecules of the invention may result in
the production of MR proteins having functional differences from the wild-type MR
proteins. These proteins may be improved in efficiency or activity, may be present in
greater numbers in the cell than is usual, or may be decreased in efficiency or activity.

The invention provides methods for screening molecules which modulate the
activity of an MR protein, either by interacting with the protein itself or a substrate or
binding partner of the MR protein, or by modulating the transcription or translation of an
MR nucleic acid molecule of the invention. In such methods, a microorganism
expressing one or more MR proteins of the invention is contacted with one or more test
compounds, and the effect of each test compound on the activity or level of expression
of the MR protein is assessed.

Such changes in activity may directly modulate the yield, production, and/or
efficiency of production of one or more fine chemicals from C. glutamicum. For
example, by optimizing the activity of an MR protein which activates the transcription
or translation of a gene encoding a biosynthetic protein for a desired fine chemical, or by
impairing or abrogating the activity of an MR protein which represses the transcription
or translation of such a gene, one may also increase the activity or rate of activity of that
biosynthetic pathway due to the presence of increased levels of what may have been a
limiting enzyme. Similarly, by altering the activity of an MR protein such that it
constitutively posttranslationally inactivates a protein involved in a degradation pathway
for a desired fine chemical, or by altering the activity of an MR protein such that it
constitutively represses the transcription or translation of such a gene, one may increase
the yield and/or rate of production of the fine chemical from the cell, due to decreased
degradation of the compound.

Further, by modulating the activity of one or more MR proteins, one may
indirectly stimulate the production or improve the rate of production of one or more fine
chemicals from the cell due to the interrelatedness of disparate metabolic pathways. For
example, by increasing the yield, production, and/or efficiency of production by
activating the expression of one or more lysine biosynthetic enzymes, one may
concomitantly increase the expression of other compounds, such as other amino acids,
which the cell would naturally require in greater quantities when lysine is required in
greater quantities. Also, regulation of metabolism throughout the cell may be altered
such that the cell is better able to grow or replicate under the environmental conditions
of fermentative culture (where nutrient and oxygen supplies may be poor and possibly
toxic waste products in the environment may be at high levels). For example, by
mutagenizing an MR protein which represses the synthesis of molecules necessary for
cell membrane production in response to high levels of waste products in the
extracellular medium (in order to block cell growth and division in suboptimal growth
conditions) such that it no longer is able to repress such synthesis, one may increase the
growth and multiplication of the cell in cultures even when the growth conditions are
suboptimal. Such enhanced growth or viability should also increase the yields and/or
rate of production of a desired fine chemical from fermentative culture, due to the
relatively greater number of cells producing this compound in the culture.

The aforementioned mutagenesis strategies for MR proteins to result in increased
yields of a fine chemical from C. glutamicum are not meant to be limiting; variations on
these strategies will be readily apparent to one of ordinary skill in the art. Using such
strategies, and incorporating the mechanisms disclosed herein, the nucleic acid and
protein molecules of the invention may be utilized to generate C. glutamicum or related
strains of bacteria expressing mutated MR nucleic acid and protein molecules such that
the yield and/or efficiency of production of a desired compound is improved. This
desired compound may be any natural product of C. glutamicum, which includes the
final products of biosynthesis pathways and intermediates of naturally-occurring
metabolic pathways, as well as molecules which do not naturally occur in the
metabolism of C. glutamicum, but which are produced by a C. glutamicum strain of the
invention.

This invention is further illustrated by the following examples which should not
be construed as limiting. The contents of all references, patent applications, patents,
published patent applications, Tables, Appendices, and the sequence listing cited
throughout this application are hereby incorporated by reference.

Exemplification

Example 1: Preparation of total genomic DNA of Corynebacterium glutamicum 
ATCC 13032 

A culture of Corynebacterium glutamicum (ATCC 13032) was grown overnight
at 30°C with vigorous shaking in BHI medium (Difco). The cells were harvested by
centrifugation, the supernatant was discarded and the cells were resuspended in 5 ml
buffer-I (5% of the original volume of the culture; all indicated volumes have been
calculated for 100 ml of culture volume). Composition of buffer-I: 140.34 g/l sucrose,
2.46 g/l MgSO4 x 7 H2O, 10 ml/l KH2PO4 solution (100 g/l, adjusted to pH 6.7 with
KOH), 50 ml/l M12 concentrate (10 g/l (NH4)2SO4, 1 g/l NaCl, 2 g/l MgSO4 x 7 H2O,
0.2 g/l CaCl2, 0.5 g/l yeast extract (Difco), 10 ml/l trace-elements-mix (200 mg/l FeSO4
x H2O, 10 mg/l ZnSO4 x 7 H2O, 3 mg/l MnCl2 x 4 H2O, 30 mg/l H3BO3, 20 mg/l CoCl2 x
6 H2O, 1 mg/l NiCl2 x 6 H2O, 3 mg/l Na2MoO4 x 2 H2O, 500 mg/l complexing agent
(EDTA or citric acid)), 100 ml/l vitamins-mix (0.2 mg/l biotin, 0.2 mg/l folic acid, 20
mg/l p-amino benzoic acid, 20 mg/l riboflavin, 40 mg/l Ca-pantothenate, 140 mg/l
nicotinic acid, 40 mg/l pyridoxol hydrochloride, 200 mg/l myo-inositol)). Lysozyme
was added to the suspension to a final concentration of 2.5 mg/ml. After an
approximately 4 h incubation at 37°C, the cell wall was degraded and the resulting
protoplasts were harvested by centrifugation. The pellet was washed once with 5 ml
buffer-I and once with 5 ml TE-buffer (10 mM Tris-HCl, 1 mM EDTA, pH 8). The
pellet was resuspended in 4 ml TE-buffer, and 0.5 ml SDS solution (10%) and 0.5 ml
NaCl solution (5 M) were added. After addition of proteinase K to a final concentration of
200 µg/ml, the suspension was incubated for ca. 18 h at 37°C. The DNA was purified by
extraction with phenol, phenol-chloroform-isoamylalcohol and chloroform-isoamylalcohol
using standard procedures. Then, the DNA was precipitated by adding
1/50 volume of 3 M sodium acetate and 2 volumes of ethanol, followed by a 30 min
incubation at -20°C and a 30 min centrifugation at 12,000 rpm in a high speed centrifuge
using a SS34 rotor (Sorvall). The DNA was dissolved in 1 ml TE-buffer containing 20
µg/ml RNaseA and dialysed at 4°C against 1000 ml TE-buffer for at least 3 hours.
During this time, the buffer was exchanged 3 times. To aliquots of 0.4 ml of the
dialysed DNA solution, 0.4 ml of 2 M LiCl and 0.8 ml of ethanol were added. After a 30
min incubation at -20°C, the DNA was collected by centrifugation (13,000 rpm, Biofuge
Fresco, Heraeus, Hanau, Germany). The DNA pellet was dissolved in TE-buffer. DNA
prepared by this procedure could be used for all purposes, including Southern blotting or
construction of genomic libraries.
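
Because the volumes above are stated for 100 ml of culture, the arithmetic for other culture sizes can be sketched as follows. This is only a bookkeeping aid under stated assumptions: the function name and dictionary keys are hypothetical, linear scaling is assumed, and the small volumes contributed by the lysozyme and proteinase K additions are neglected.

```python
def lysis_plan(culture_ml):
    """Scale the Example 1 volumes (stated per 100 ml culture) to another culture size."""
    scale = culture_ml / 100.0
    buffer_i_ml = 5.0 * scale                      # buffer-I is 5% of the culture volume
    te_ml = 4.0 * scale                            # TE-buffer used to resuspend the pellet
    sds_ml = 0.5 * scale                           # 10% SDS stock
    nacl_ml = 0.5 * scale                          # 5 M NaCl stock
    lysate_ml = te_ml + sds_ml + nacl_ml
    return {
        "buffer_I_ml": buffer_i_ml,
        "lysozyme_mg": 2.5 * buffer_i_ml,                # 2.5 mg/ml final concentration
        "lysate_ml": lysate_ml,
        "final_SDS_percent": 10.0 * sds_ml / lysate_ml,  # dilution of the 10% stock
        "final_NaCl_M": 5.0 * nacl_ml / lysate_ml,       # dilution of the 5 M stock
        "proteinase_K_mg": 0.2 * lysate_ml,              # 200 ug/ml = 0.2 mg/ml final
    }


plan_100 = lysis_plan(100.0)
plan_250 = lysis_plan(250.0)
```

For the 100 ml case this reproduces the stated 5 ml of buffer-I and gives roughly 1% SDS and 0.5 M NaCl in the 5 ml lysate; the final concentrations do not change with scale, only the absolute amounts do.
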

Example 2: Construction of genomic libraries in Escherichia coli of Corynebacterium
glutamicum ATCC13032.

Using DNA prepared as described in Example 1, cosmid and plasmid libraries were
constructed according to known and well established methods (see, e.g., Sambrook, J. et al.
(1989) "Molecular Cloning: A Laboratory Manual", Cold Spring Harbor Laboratory Press,
or Ausubel, F.M. et al. (1994) "Current Protocols in Molecular Biology", John Wiley &
Sons).

Any plasmid or cosmid could be used. Of particular use were the plasmids pBR322
(Sutcliffe, J.G. (1979) Proc. Natl. Acad. Sci. USA, 75:3737-3741); pACYC177 (Chang &
Cohen (1978) J. Bacteriol. 134:1141-1156), plasmids of the pBS series (pBSSK+, pBSSK- and
others; Stratagene, La Jolla, USA), or cosmids such as SuperCos1 (Stratagene, La Jolla, USA) or
Lorist6 (Gibson, T.J., Rosenthal, A. and Waterson, R.H. (1987) Gene 53:283-286). Gene libraries
specifically for use in C. glutamicum may be constructed using plasmid pSL109 (Lee, H.-S. and
A. J. Sinskey (1994) J. Microbiol. Biotechnol. 4: 256-263).

Example 3: DNA Sequencing and Computational Functional Analysis

Genomic libraries as described in Example 2 were used for DNA sequencing
according to standard methods, in particular by the chain termination method using
ABI377 sequencing machines (see, e.g., Fleischmann, R.D. et al. (1995) "Whole-genome
Random Sequencing and Assembly of Haemophilus influenzae Rd", Science 269:496-512).
Sequencing primers with the following nucleotide sequences were used:
5'-GGAAACAGTATGACCATG-3' or 5'-GTAAAACGACGGCCAGT-3'.
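
Basic properties of the two sequencing primers can be checked with a short sketch. The primer sequences are those given above; the labels "primer_1" and "primer_2", the function names, and the choice of the Wallace rule (a rough melting-temperature estimate for short oligonucleotides) are assumptions made for illustration and are not part of the described method.

```python
# Primer sequences as given in Example 3 (5' to 3'); the labels are arbitrary.
PRIMERS = {
    "primer_1": "GGAAACAGTATGACCATG",
    "primer_2": "GTAAAACGACGGCCAGT",
}


def gc_fraction(seq):
    """Fraction of G and C bases in the sequence."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def wallace_tm(seq):
    """Wallace rule estimate in degrees C: 2*(A+T) + 4*(G+C)."""
    seq = seq.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


summary = {
    name: (len(seq), round(gc_fraction(seq), 2), wallace_tm(seq))
    for name, seq in PRIMERS.items()
}
```

Under this rule both primers come out with the same estimated melting temperature, which is consistent with their use as a pair under common cycling conditions; more accurate nearest-neighbor estimates would require salt and primer concentrations that are not stated here.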

Example 4: In vivo Mutagenesis 

In vivo mutagenesis of Corynebacterium glutamicum can be performed by passage of
plasmid (or other vector) DNA through E. coli or other microorganisms (e.g., Bacillus spp. or
yeasts such as Saccharomyces cerevisiae) which are impaired in their capabilities to maintain
the integrity of their genetic information. Typical mutator strains have mutations in the genes
for the DNA repair system (e.g., mutHLS, mutD, mutT, etc.; for reference, see Rupp, W.D.
(1996) DNA repair mechanisms, in: Escherichia coli and Salmonella, p. 2277-2294, ASM:
Washington). Such strains are well known to one of ordinary skill in the art. The use of such
strains is illustrated, for example, in Greener, A. and Callahan, M. (1994) Strategies 7: 32-34.

Example 5: DNA Transfer Between Escherichia coli and Corynebacterium
glutamicum

Several Corynebacterium and Brevibacterium species contain endogenous
plasmids (e.g., pHM1519 or pBL1) which replicate autonomously (for review see, e.g.,
Martin, J.F. et al. (1987) Biotechnology, 5:137-146). Shuttle vectors for Escherichia coli
and Corynebacterium glutamicum can be readily constructed by using standard vectors for
E. coli (Sambrook, J. et al. (1989), "Molecular Cloning: A Laboratory Manual", Cold
Spring Harbor Laboratory Press or Ausubel, F.M. et al. (1994) "Current Protocols in
Molecular Biology", John Wiley & Sons) to which an origin of replication for and a
suitable marker from Corynebacterium glutamicum are added. Such origins of replication
are preferably taken from endogenous plasmids isolated from Corynebacterium and
Brevibacterium species. Of particular use as transformation markers for these species are
genes for kanamycin resistance (such as those derived from the Tn5 or Tn903
transposons) or chloramphenicol resistance (Winnacker, E.L. (1987) "From Genes to Clones -
Introduction to Gene Technology", VCH, Weinheim). There are numerous examples in the
literature of the construction of a wide variety of shuttle vectors which replicate in both E.
coli and C. glutamicum, and which can be used for several purposes, including gene
overexpression (for reference, see, e.g., Yoshihama, M. et al. (1985) J. Bacteriol. 162:591-597,
Martin, J.F. et al. (1987) Biotechnology, 5:137-146 and Eikmanns, B.J. et al. (1991) Gene,
102:93-98).

Using standard methods, it is possible to clone a gene of interest into one of the
shuttle vectors described above and to introduce such hybrid vectors into strains of
Corynebacterium glutamicum. Transformation of C. glutamicum can be achieved by
protoplast transformation (Katsumata, R. et al. (1984) J. Bacteriol. 159:306-311),
electroporation (Liebl, E. et al. (1989) FEMS Microbiol. Letters, 53:299-303) and, in cases
where special vectors are used, also by conjugation (as described, e.g., in Schafer, A. et al.
(1990) J. Bacteriol. 172:1663-1666). It is also possible to transfer the shuttle vectors for
C. glutamicum to E. coli by preparing plasmid DNA from C. glutamicum (using standard
methods well-known in the art) and transforming it into E. coli. This transformation step
can be performed using standard methods, but it is advantageous to use an Mcr-deficient
E. coli strain, such as NM522 (Gough & Murray (1983) J. Mol. Biol. 166:1-19).

Genes may be overexpressed in C. glutamicum strains using plasmids which
comprise pCG1 (U.S. Patent No. 4,617,267) or fragments thereof, and optionally the
gene for kanamycin resistance from Tn903 (Grindley, N.D. and Joyce, C.M. (1980)
Proc. Natl. Acad. Sci. USA 77(12): 7176-7180). In addition, genes may be
overexpressed in C. glutamicum strains using plasmid pSL109 (Lee, H.-S. and A. J.
Sinskey (1994) J. Microbiol. Biotechnol. 4: 256-263).

Aside from the use of replicative plasmids, gene overexpression can also be
achieved by integration into the genome. Genomic integration in C. glutamicum or other
Corynebacterium or Brevibacterium species may be accomplished by well-known
methods, such as homologous recombination with genomic region(s), restriction
endonuclease mediated integration (REMI) (see, e.g., DE Patent 19823834), or through
the use of transposons. It is also possible to modulate the activity of a gene of interest by
modifying the regulatory regions (e.g., a promoter, a repressor, and/or an enhancer) by
sequence modification, insertion, or deletion using site-directed methods (such as
homologous recombination) or methods based on random events (such as transposon
mutagenesis or REMI). Nucleic acid sequences which function as transcriptional
terminators may also be inserted 3' to the coding region of one or more genes of the
invention; such terminators are well-known in the art and are described, for example, in
Winnacker, E.L. (1987) From Genes to Clones - Introduction to Gene Technology. VCH:
Weinheim.

Example 6: Assessment of the Expression of the Mutant Protein 

Observations of the activity of a mutated protein in a transformed host cell rely on
the fact that the mutant protein is expressed in a similar fashion and in a similar quantity
to that of the wild-type protein. A useful method to ascertain the level of transcription of
the mutant gene (an indicator of the amount of mRNA available for translation to the gene
product) is to perform a Northern blot (for reference see, for example, Ausubel et al.
(1988) Current Protocols in Molecular Biology, Wiley: New York), in which a primer
designed to bind to the gene of interest is labeled with a detectable tag (usually radioactive
or chemiluminescent), such that when the total RNA of a culture of the organism is
extracted, run on gel, transferred to a stable matrix and incubated with this probe, the
binding and quantity of binding of the probe indicates the presence and also the quantity
of mRNA for this gene. This information is evidence of the degree of transcription of the
mutant gene. Total cellular RNA can be prepared from Corynebacterium glutamicum by
several methods, all well-known in the art, such as that described in Bormann, E.R. et al.
(1992) Mol. Microbiol. 6: 317-326.
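
Comparing the quantity of bound probe between a mutant and a wild-type culture usually requires some normalization. The sketch below shows one common approach, assumed here rather than taken from this specification: each band intensity is divided by a loading-control signal from the same lane and then expressed relative to a reference sample. All names and densitometry values are invented.

```python
def relative_expression(target, loading_control, reference_sample):
    """Normalize target band intensities to a loading control, then to a reference lane."""
    normalized = {sample: target[sample] / loading_control[sample] for sample in target}
    reference = normalized[reference_sample]
    return {sample: value / reference for sample, value in normalized.items()}


# Invented densitometry readings (arbitrary units).
target_signal = {"wild_type": 1000.0, "mutant": 1800.0}
control_signal = {"wild_type": 500.0, "mutant": 600.0}
ratios = relative_expression(target_signal, control_signal, "wild_type")
```

With these made-up numbers the mutant lane carries 1.5 times the normalized signal of the wild type. The same arithmetic applies to any blot in which band intensities are compared between lanes.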

To assess the presence or relative quantity of protein translated from this mRNA,
standard techniques, such as a Western blot, may be employed (see, for example, Ausubel
et al. (1988) Current Protocols in Molecular Biology, Wiley: New York). In this process,
total cellular proteins are extracted, separated by gel electrophoresis, transferred to a
matrix such as nitrocellulose, and incubated with a probe, such as an antibody, which
specifically binds to the desired protein. This probe is generally tagged with a
chemiluminescent or colorimetric label which may be readily detected. The presence and
quantity of label observed indicates the presence and quantity of the desired mutant
protein present in the cell.

Example 7: Growth of Genetically Modified Corynebacterium glutamicum - Media
and Culture Conditions

Genetically modified Corynebacteria are cultured in synthetic or natural growth
media. A number of different growth media for Corynebacteria are both well-known and
readily available (Liebl et al. (1989) Appl. Microbiol. Biotechnol., 32:205-210; von der
Osten et al. (1989) Biotechnology Letters, 11:11-16; Patent DE 4,120,867; Liebl (1992)
"The Genus Corynebacterium", in: The Prokaryotes, Volume II, Balows, A. et al., eds.
Springer-Verlag). These media consist of one or more carbon sources, nitrogen sources,
inorganic salts, vitamins and trace elements. Preferred carbon sources are sugars, such as
mono-, di-, or polysaccharides. For example, glucose, fructose, mannose, galactose,
ribose, sorbose, ribulose, lactose, maltose, sucrose, raffinose, starch or cellulose serve as
very good carbon sources. It is also possible to supply sugar to the media via complex
compounds such as molasses or other by-products from sugar refinement. It can also be
advantageous to supply mixtures of different carbon sources. Other possible carbon
sources are alcohols and organic acids, such as methanol, ethanol, acetic acid or lactic
acid. Nitrogen sources are usually organic or inorganic nitrogen compounds, or materials
which contain these compounds. Exemplary nitrogen sources include ammonia gas or
ammonium salts, such as NH4Cl or (NH4)2SO4, NH4OH, nitrates, urea, amino acids or
complex nitrogen sources like corn steep liquor, soy bean flour, soy bean protein, yeast
extract, meat extract and others.

Inorganic salt compounds which may be included in the media include the
chloride, phosphate or sulfate salts of calcium, magnesium, sodium, cobalt,
molybdenum, potassium, manganese, zinc, copper and iron. Chelating compounds can be
added to the medium to keep the metal ions in solution. Particularly useful chelating
compounds include dihydroxyphenols, like catechol or protocatechuate, or organic acids,
such as citric acid. It is typical for the media to also contain other growth factors, such as
vitamins or growth promoters, examples of which include biotin, riboflavin, thiamin, folic
acid, nicotinic acid, pantothenate and pyridoxin. Growth factors and salts frequently
originate from complex media components such as yeast extract, molasses, corn steep
liquor and others. The exact composition of the media compounds depends strongly on
the immediate experiment and is individually decided for each specific case. Information
about media optimization is available in the textbook "Applied Microbial Physiology, A
Practical Approach" (eds. P.M. Rhodes, P.F. Stanbury, IRL Press (1997) pp. 53-73, ISBN 0
19 963577 3). It is also possible to select growth media from commercial suppliers, like
standard 1 (Merck) or BHI (brain heart infusion, DIFCO) or others.

All medium components are sterilized, either by heat (20 minutes at 1.5 bar and
121°C) or by sterile filtration. The components can either be sterilized together or, if
necessary, separately. All media components can be present at the beginning of growth,
or they can optionally be added continuously or batchwise.

Culture conditions are defined separately for each experiment. The temperature
should be in a range between 15°C and 45°C. The temperature can be kept constant or can
be altered during the experiment. The pH of the medium should be in the range of 5 to
8.5, preferably around 7.0, and can be maintained by the addition of buffers to the media.
An exemplary buffer for this purpose is a potassium phosphate buffer. Synthetic buffers
such as MOPS, HEPES, ACES and others can alternatively or simultaneously be used. It
is also possible to maintain a constant culture pH through the addition of NaOH or
NH4OH during growth. If complex medium components such as yeast extract are utilized,
the necessity for additional buffers may be reduced, due to the fact that many complex
compounds have high buffer capacities. If a fermentor is utilized for culturing the
microorganisms, the pH can also be controlled using gaseous ammonia.

The incubation time is usually in a range from several hours to several days. This
time is selected in order to permit the maximal amount of product to accumulate in the
broth. The disclosed growth experiments can be carried out in a variety of vessels, such as
microtiter plates, glass tubes, glass flasks or glass or metal fermentors of different sizes.
For screening a large number of clones, the microorganisms should be cultured in
microtiter plates, glass tubes or shake flasks, either with or without baffles. Preferably
100 ml shake flasks are used, filled with 10% (by volume) of the required growth
medium. The flasks should be shaken on a rotary shaker (amplitude 25 mm) using a
speed-range of 100 - 300 rpm. Evaporation losses can be diminished by the maintenance
of a humid atmosphere; alternatively, a mathematical correction for evaporation losses
should be performed.
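
One simple form such a correction could take is sketched below. It assumes that only water is lost, so that all solutes are concentrated by the same factor, and that the initial and final culture volumes (or weights) are known; the function names and the example figures are hypothetical.

```python
def fill_volume_ml(flask_ml=100.0, fill_fraction=0.10):
    """Working volume of a shake flask filled to the given fraction of its nominal size."""
    return flask_ml * fill_fraction


def correct_for_evaporation(measured_conc, initial_ml, final_ml):
    """Rescale a concentration measured in the reduced volume back to the initial volume."""
    if initial_ml <= 0 or final_ml <= 0:
        raise ValueError("volumes must be positive")
    return measured_conc * final_ml / initial_ml


# Invented example: a 10 ml culture that has shrunk to 9 ml by the time of sampling.
working_ml = fill_volume_ml()
corrected = correct_for_evaporation(20.0, working_ml, 9.0)
```

Here a product reading of 20 g/l in the remaining 9 ml corresponds to 18 g/l on the basis of the original 10 ml, so uncorrected values would overstate the titer by about 11%.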

If genetically modified clones are tested, an unmodified control clone or a control
clone containing the basic plasmid without any insert should also be tested. The medium
is inoculated to an OD600 of 0.5 - 1.5 using cells grown on agar plates, such as CM plates
(10 g/l glucose, 2.5 g/l NaCl, 2 g/l urea, 10 g/l polypeptone, 5 g/l yeast extract, 5 g/l meat
extract, 22 g/l agar, pH 6.8 with 2M NaOH) that had been incubated at 30°C. Inoculation of the
media is accomplished by either introduction of a saline suspension of C. glutamicum cells
from CM plates or addition of a liquid preculture of this bacterium.
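
The volume of cell suspension needed to reach a chosen starting OD600 follows from a simple dilution balance (C1 x V1 = C2 x V2). The sketch assumes that OD600 scales linearly with cell density over the range used and that the final volume includes the inoculum; the function name and example values are illustrative.

```python
def inoculum_volume_ml(suspension_od, target_od, final_volume_ml):
    """Volume of cell suspension giving target_od in final_volume_ml (C1*V1 = C2*V2)."""
    if suspension_od <= target_od:
        raise ValueError("suspension must be denser than the target OD600")
    return target_od * final_volume_ml / suspension_od


# Invented example: a suspension at OD600 = 20 used to start a 10 ml culture at OD600 = 1.0.
volume_needed = inoculum_volume_ml(20.0, 1.0, 10.0)
```

In this example 0.5 ml of suspension is made up to 10 ml with medium. Dense suspensions generally need to be diluted before measurement so that the reading stays within the linear range of the photometer.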

Example 8: In vitro Analysis of the Function of Mutant Proteins

The determination of activities and kinetic parameters of enzymes is well
established in the art. Experiments to determine the activity of any given altered
enzyme must be tailored to the specific activity of the wild-type enzyme, which is well
within the ability of one of ordinary skill in the art. Overviews about enzymes in
general, as well as specific details concerning structure, kinetics, principles, methods,
applications and examples for the determination of many enzyme activities may be
found, for example, in the following references: Dixon, M., and Webb, E.C., (1979)
Enzymes. Longmans: London; Fersht, (1985) Enzyme Structure and Mechanism.
Freeman: New York; Walsh, (1979) Enzymatic Reaction Mechanisms. Freeman: San
Francisco; Price, N.C., Stevens, L. (1982) Fundamentals of Enzymology. Oxford Univ.
Press: Oxford; Boyer, P.D., ed. (1983) The Enzymes, 3rd ed. Academic Press: New
York; Bisswanger, H., (1994) Enzymkinetik, 2nd ed. VCH: Weinheim (ISBN
3527300325); Bergmeyer, H.U., Bergmeyer, J., Graßl, M., eds. (1983-1986) Methods of
Enzymatic Analysis, 3rd ed., vol. I-XII, Verlag Chemie: Weinheim; and Ullmann's
Encyclopedia of Industrial Chemistry (1987) vol. A9, "Enzymes". VCH: Weinheim, p.
352-363.
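
As one textbook illustration of estimating kinetic parameters, the sketch below fits Michaelis-Menten constants by a Lineweaver-Burk (double-reciprocal) linear regression. This is a generic method chosen for brevity, not one prescribed by this specification; the data are synthetic and noise-free, generated from assumed values Vmax = 10 and Km = 2 in arbitrary units.

```python
def michaelis_menten(s, vmax, km):
    """Initial rate predicted by the Michaelis-Menten equation."""
    return vmax * s / (km + s)


def lineweaver_burk_fit(substrate, velocity):
    """Estimate (Vmax, Km) by least squares on 1/v = (Km/Vmax)*(1/[S]) + 1/Vmax."""
    x = [1.0 / s for s in substrate]
    y = [1.0 / v for v in velocity]
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    vmax = 1.0 / intercept
    km = slope * vmax
    return vmax, km


# Synthetic, noise-free data from assumed parameters (arbitrary units).
substrate = [0.5, 1.0, 2.0, 5.0, 10.0]
velocity = [michaelis_menten(s, 10.0, 2.0) for s in substrate]
vmax_est, km_est = lineweaver_burk_fit(substrate, velocity)
```

With real measurements the double-reciprocal transform amplifies error at low substrate concentrations, so a direct nonlinear fit is usually preferred. Comparing the fitted Vmax and Km of an altered enzyme with those of the wild type is one way to express a change in function.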

The activity of proteins which bind to DNA can be measured by several
well-established methods, such as DNA band-shift assays (also called gel retardation assays).
The effect of such proteins on the expression of other molecules can be measured using
reporter gene assays (such as that described in Kolmar, H. et al. (1995) EMBO J. 14:
3895-3904 and references cited therein). Reporter gene test systems are well known and
established for applications in both pro- and eukaryotic cells, using enzymes such as
beta-galactosidase, green fluorescent protein, and several others.

The determination of activity of membrane-transport proteins can be performed
according to techniques such as those described in Gennis, R.B. (1989) "Pores,
Channels and Transporters", in Biomembranes, Molecular Structure and Function,
Springer: Heidelberg, p. 85-137; 199-234; and 270-322.

Example 9: Analysis of Impact of Mutant Protein on the Production of the Desired 
Product 

The effect of the genetic modification in C. glutamicum on production of a
desired compound (such as an amino acid) can be assessed by growing the modified
microorganism under suitable conditions (such as those described above) and analyzing
the medium and/or the cellular component for increased production of the desired
product (i.e., an amino acid). Such analysis techniques are well known to one of
ordinary skill in the art, and include spectroscopy, thin layer chromatography, staining
methods of various kinds, enzymatic and microbiological methods, and analytical
chromatography such as high performance liquid chromatography (see, for example,
Ullmann's Encyclopedia of Industrial Chemistry, vol. A2, p. 89-90 and p. 443-613, VCH:
Weinheim (1985); Fallon, A. et al., (1987) "Applications of HPLC in Biochemistry" in:
Laboratory Techniques in Biochemistry and Molecular Biology, vol. 17; Rehm et al.
(1993) Biotechnology, vol. 3, Chapter III: "Product recovery and purification", page
469-714, VCH: Weinheim; Belter, P.A. et al. (1988) Bioseparations: downstream
processing for biotechnology, John Wiley and Sons; Kennedy, J.F. and Cabral, J.M.S.
(1992) Recovery processes for biological materials, John Wiley and Sons; Shaeiwitz,
J.A. and Henry, J.D. (1988) Biochemical separations, in: Ullmann's Encyclopedia of
Industrial Chemistry, vol. B3, Chapter 11, page 1-27, VCH: Weinheim; and Dechow,
F.J. (1989) Separation and purification techniques in biotechnology, Noyes
Publications).

In addition to the measurement of the final product of fermentation, it is also
possible to analyze other components of the metabolic pathways utilized for the
production of the desired compound, such as intermediates and side-products, to
determine the overall yield, production, and/or efficiency of production of the
compound. Analysis methods include measurements of nutrient levels in the medium
(e.g., sugars, hydrocarbons, nitrogen sources, phosphate, and other ions), measurements
of biomass composition and growth, analysis of the production of common metabolites
of biosynthetic pathways, and measurement of gases produced during fermentation.
Standard methods for these measurements are outlined in Applied Microbial Physiology,
A Practical Approach, P.M. Rhodes and P.P. Stanbury, eds., IRL Press, p. 103-129;
131-163; and 165-192 (ISBN: 0199635773) and references cited therein.
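
By way of illustration only, the overall yield and production rate mentioned above might be computed from medium measurements as sketched below. The function names, units (grams per liter, hours), and sample values are assumptions made for this sketch and do not appear in the methods cited above.

```python
def product_yield(product_g_per_l, substrate_start_g_per_l, substrate_end_g_per_l):
    """Yield on substrate: grams of product formed per gram of substrate consumed."""
    consumed = substrate_start_g_per_l - substrate_end_g_per_l
    if consumed <= 0:
        raise ValueError("no substrate was consumed")
    return product_g_per_l / consumed


def volumetric_productivity(product_g_per_l, hours):
    """Average production rate in grams of product per liter per hour."""
    if hours <= 0:
        raise ValueError("fermentation time must be positive")
    return product_g_per_l / hours


# Hypothetical fermentation: 30 g/L amino acid formed, sugar falls from 100 to 25 g/L in 48 h.
y = product_yield(30.0, 100.0, 25.0)
r = volumetric_productivity(30.0, 48.0)
print(f"yield = {y:.3f} g/g, productivity = {r:.3f} g/(L*h)")
```

Comparing these two figures between a modified strain and its parent strain, grown under the same conditions, is one simple way to express the effect of the genetic modification.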

Example 10: Purification of the Desired Product from C. glutamicum Culture 

Recovery of the desired product from the C. glutamicum cells or supernatant of
the above-described culture can be performed by various methods well known in the art.
If the desired product is not secreted from the cells, the cells can be harvested from the
culture by low-speed centrifugation and then lysed by standard techniques, such
as mechanical force or sonication. The cellular debris is removed by centrifugation, and
the supernatant fraction containing the soluble proteins is retained for further
purification of the desired compound. If the product is secreted from the C. glutamicum
cells, then the cells are removed from the culture by low-speed centrifugation, and the
supernatant fraction is retained for further purification.

The supernatant fraction from either purification method is subjected to
chromatography with a suitable resin, in which either the desired molecule is retained on
the chromatography resin while many of the impurities in the sample are not, or the
impurities are retained by the resin while the desired molecule is not. Such chromatography steps
may be repeated as necessary, using the same or different chromatography resins. One 
of ordinary skill in the art would be well-versed in the selection of appropriate 



chromatography resins and in their most efficacious application for a particular molecule 
to be purified. The purified product may be concentrated by filtration or ultrafiltration, 
and stored at a temperature at which the stability of the product is maximized. 
There is a wide array of purification methods known to the art and the
preceding method of purification is not meant to be limiting. Such purification
techniques are described, for example, in Bailey, J.E. & Ollis, D.F. Biochemical 
Engineering Fundamentals, McGraw-Hill: New York (1986). 

The identity and purity of the isolated compounds may be assessed by techniques
standard in the art. These include high-performance liquid chromatography (HPLC),
spectroscopic methods, staining methods, thin layer chromatography, NIRS, enzymatic
assay, and microbiological methods. Such analysis methods are reviewed in: Patek et al. (1994)
Appl. Environ. Microbiol. 60: 133-140; Malakhova et al. (1996) Biotekhnologiya 11: 27-32;
Schmidt et al. (1998) Bioprocess Engineer. 19: 67-70; Ullmann's Encyclopedia
of Industrial Chemistry (1996) vol. A27, VCH: Weinheim, p. 89-90, p. 521-540, p. 540-547,
p. 559-566, p. 575-581 and p. 581-587; Michal, G. (1999) Biochemical Pathways: An
Atlas of Biochemistry and Molecular Biology, John Wiley and Sons; and Fallon, A. et al.
(1987) Applications of HPLC in Biochemistry in: Laboratory Techniques in
Biochemistry and Molecular Biology, vol. 17.

Example 11: Analysis of the Gene Sequences of the Invention

The comparison of sequences and determination of percent homology between
two sequences are art-known techniques, and can be accomplished using a mathematical
algorithm, such as the algorithm of Karlin and Altschul (1990) Proc. Natl. Acad. Sci.
USA 87:2264-68, modified as in Karlin and Altschul (1993) Proc. Natl. Acad. Sci. USA
90:5873-77. Such an algorithm is incorporated into the NBLAST and XBLAST
programs (version 2.0) of Altschul et al. (1990) J. Mol. Biol. 215:403-10. BLAST
nucleotide searches can be performed with the NBLAST program, score = 100,
wordlength = 12 to obtain nucleotide sequences homologous to MR nucleic acid
molecules of the invention. BLAST protein searches can be performed with the
XBLAST program, score = 50, wordlength = 3 to obtain amino acid sequences
homologous to MR protein molecules of the invention. To obtain gapped alignments for
comparison purposes, Gapped BLAST can be utilized as described in Altschul et al.
(1997) Nucleic Acids Res. 25(17):3389-3402. When utilizing BLAST and Gapped
BLAST programs, one of ordinary skill in the art will know how to optimize the
parameters of the program (e.g., XBLAST and NBLAST) for the specific sequence
being analyzed.
Another example of a mathematical algorithm utilized for the comparison of
sequences is the algorithm of Meyers and Miller ((1988) Comput. Appl. Biosci. 4:11-17).
Such an algorithm is incorporated into the ALIGN program (version 2.0) which is
part of the GCG sequence alignment software package. When utilizing the ALIGN
program for comparing amino acid sequences, a PAM120 weight residue table, a gap
length penalty of 12, and a gap penalty of 4 can be used. Additional algorithms for
sequence analysis are known in the art, and include ADVANCE and ADAM, described
in Torelli and Robotti (1994) Comput. Appl. Biosci. 10:3-5; and FASTA, described in
Pearson and Lipman (1988) P.N.A.S. 85:2444-8.

The percent homology between two amino acid sequences can also be
determined using the GAP program in the GCG software package (available at
http://www.gcg.com), using either a Blosum 62 matrix or a PAM250 matrix, and a gap
weight of 12, 10, 8, 6, or 4 and a length weight of 2, 3, or 4. The percent homology
between two nucleic acid sequences can be determined using the GAP program in the
GCG software package, using standard parameters, such as a gap weight of 50 and a
length weight of 3.
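
For illustration only, the principle of a global alignment followed by a percent identity calculation can be sketched as below. This is not the GAP program: it is a minimal Needleman-Wunsch implementation that assumes a single linear gap penalty rather than GAP's separate gap weight and length weight, and the scoring values, function names, and test sequences are hypothetical.

```python
def global_align(a, b, match=1, mismatch=-1, gap=-2):
    """Needleman-Wunsch global alignment with a linear gap penalty.

    Returns the two aligned strings, with gaps shown as '-'.
    """
    n, m = len(a), len(b)
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)
    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        sub = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            out_a.append(a[i - 1]); out_b.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-"); i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1]); j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b))


def percent_identity(aligned_a, aligned_b):
    """Identical, non-gap columns as a percentage of the alignment length."""
    same = sum(1 for x, y in zip(aligned_a, aligned_b) if x == y and x != "-")
    return 100.0 * same / len(aligned_a)


top, bottom = global_align("ACGT", "AGT")
print(top, bottom, percent_identity(top, bottom))
```

Because the alignment is global, every position of both sequences is accounted for, which is why the percentage depends on how gaps are penalized and on the lengths of the two sequences being compared.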

A comparative analysis of the gene sequences of the invention with those present
in Genbank has been performed using techniques known in the art (see, e.g., Baxevanis
and Ouellette, eds. (1998) Bioinformatics: A Practical Guide to the Analysis of Genes
and Proteins. John Wiley and Sons: New York). The gene sequences of the invention
were compared to genes present in Genbank in a three-step process. In a first step, a
BLASTN analysis (e.g., a local alignment analysis) was performed for each of the
sequences of the invention against the nucleotide sequences present in Genbank, and the
top 500 hits were retained for further analysis. A subsequent FASTA search (e.g., a
combined local and global alignment analysis, in which limited regions of the sequences
are aligned) was performed on these 500 hits. Each gene sequence of the invention was
subsequently globally aligned to each of the top three FASTA hits, using the GAP
program in the GCG software package (using standard parameters). In order to obtain
correct results, the length of the sequences extracted from Genbank was adjusted to the
length of the query sequences by methods well-known in the art. The results of this
analysis are set forth in Table 4. The resulting data is identical to that which would have
been obtained had a GAP (global) analysis alone been performed on each of the genes of
the invention in comparison with each of the references in Genbank, but required
significantly reduced computational time as compared to such a database-wide GAP
(global) analysis. Sequences of the invention for which no alignments above the cutoff
values were obtained are indicated on Table 4 by the absence of alignment information.
It will further be understood by one of ordinary skill in the art that the GAP alignment
homology percentages set forth in Table 4 under the heading "% homology (GAP)" are
listed in the European numerical format, wherein a ',' represents a decimal point. For
example, a value of "40,345" in this column represents "40.345%".
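
As a convenience for readers who process Table 4 by computer, the conversion from the European numerical format can be sketched as follows. The function name is hypothetical, and the sketch assumes that the values contain a decimal comma only, with no thousands separators.

```python
def parse_european_percent(text):
    """Convert a European-format percentage such as '40,345' to a float (40.345)."""
    cleaned = text.strip().rstrip("%").strip()
    return float(cleaned.replace(",", "."))


print(parse_european_percent("40,345"))
```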

Example 12: Construction and Operation of DNA Microarrays

The sequences of the invention may additionally be used in the construction and
application of DNA microarrays (the design, methodology, and uses of DNA arrays are
well known in the art, and are described, for example, in Schena, M. et al. (1995)
Science 270: 467-470; Wodicka, L. et al. (1997) Nature Biotechnology 15: 1359-1367;
DeSaizieu, A. et al. (1998) Nature Biotechnology 16: 45-48; and DeRisi, J.L. et al.
(1997) Science 278: 680-686).

DNA microarrays are solid or flexible supports consisting of nitrocellulose,
nylon, glass, silicone, or other materials. Nucleic acid molecules may be attached to the
surface in an ordered manner. After appropriate labeling, other nucleic acids or nucleic
acid mixtures can be hybridized to the immobilized nucleic acid molecules, and the label
may be used to monitor and measure the individual signal intensities of the hybridized
molecules at defined regions. This methodology allows the simultaneous quantification
of the relative or absolute amount of all or selected nucleic acids in the applied nucleic
acid sample or mixture. DNA microarrays, therefore, permit an analysis of the
expression of multiple (as many as 6800 or more) nucleic acids in parallel (see, e.g.,
Schena, M. (1996) BioEssays 18(5): 427-431).

The sequences of the invention may be used to design oligonucleotide primers
which are able to amplify defined regions of one or more C. glutamicum genes by a
nucleic acid amplification reaction such as the polymerase chain reaction. The choice
and design of the 5' or 3' oligonucleotide primers or of appropriate linkers allows the
covalent attachment of the resulting PCR products to the surface of a support medium
described above (and also described, for example, in Schena, M. et al. (1995) Science 270:
467-470).

Nucleic acid microarrays may also be constructed by in situ oligonucleotide
synthesis as described by Wodicka, L. et al. (1997) Nature Biotechnology 15: 1359-1367.
By photolithographic methods, precisely defined regions of the matrix are
exposed to light. Protective groups which are photolabile are thereby activated and
undergo nucleotide addition, whereas regions that are masked from light do not undergo
any modification. Subsequent cycles of protection and light activation permit the
synthesis of different oligonucleotides at defined positions. Small, defined regions of
the genes of the invention may be synthesized on microarrays by solid phase
oligonucleotide synthesis.
The nucleic acid molecules of the invention present in a sample or mixture of
nucleotides may be hybridized to the microarrays. These nucleic acid molecules can be
labeled according to standard methods. In brief, nucleic acid molecules (e.g., mRNA
molecules or DNA molecules) are labeled by the incorporation of isotopically or
fluorescently labeled nucleotides, e.g., during reverse transcription or DNA synthesis.
Hybridization of labeled nucleic acids to microarrays is described, e.g., in Schena, M. et
al. (1995) supra; Wodicka, L. et al. (1997) supra; and DeSaizieu, A. et al. (1998)
supra. The detection and quantification of the hybridized molecule are tailored to the
specific incorporated label. Radioactive labels can be detected, for example, as
described in Schena, M. et al. (1995) supra, and fluorescent labels may be detected, for
example, by the method of Shalon et al. (1996) Genome Research 6: 639-645.

The application of the sequences of the invention to DNA microarray
technology, as described above, permits comparative analyses of different strains of C.
glutamicum or other Corynebacteria. For example, studies of inter-strain variations
based on individual transcript profiles and the identification of genes that are important
for specific and/or desired strain properties such as pathogenicity, productivity and
stress tolerance are facilitated by nucleic acid array methodologies. Also, comparisons
of the profile of expression of genes of the invention during the course of a fermentation
reaction are possible using nucleic acid array technology.
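
One common way of expressing such a comparison, offered here only as a hedged sketch and not as the method of the cited references, is the log2 ratio of the signal intensities measured for each gene in two samples. The function name, the intensity floor used to avoid division by zero, and the intensity values are assumptions; the gene labels are borrowed from Appendix A purely as examples.

```python
import math


def log2_ratios(reference, test, floor=1.0):
    """Per-gene log2(test / reference) for genes measured in both samples.

    Intensities below `floor` are raised to `floor` so the ratio stays defined.
    """
    ratios = {}
    for gene in reference.keys() & test.keys():
        ref = max(reference[gene], floor)
        tst = max(test[gene], floor)
        ratios[gene] = math.log2(tst / ref)
    return ratios


# Hypothetical background-corrected signal intensities for two strains.
strain_a = {"RXA00004": 100.0, "RXA00006": 400.0}
strain_b = {"RXA00004": 400.0, "RXA00006": 100.0}
print(log2_ratios(strain_a, strain_b))
```

A positive value indicates a stronger signal in the second sample and a negative value a weaker one; the same calculation can be applied to successive time points of a fermentation.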

Example 13: Analysis of the Dynamics of Cellular Protein Populations 
(Proteomics) 

The genes, compositions, and methods of the invention may be applied to study
the interactions and dynamics of populations of proteins, termed 'proteomics'. Protein
populations of interest include, but are not limited to, the total protein population of C.
glutamicum (e.g., in comparison with the protein populations of other organisms), those
proteins which are active under specific environmental or metabolic conditions (e.g.,
during fermentation, at high or low temperature, or at high or low pH), or those proteins
which are active during specific phases of growth and development.

Protein populations can be analyzed by various well-known techniques, such as
gel electrophoresis. Cellular proteins may be obtained, for example, by lysis or
extraction, and may be separated from one another using a variety of electrophoretic
techniques. Sodium dodecyl sulfate polyacrylamide gel electrophoresis (SDS-PAGE)
separates proteins largely on the basis of their molecular weight. Isoelectric focusing
polyacrylamide gel electrophoresis (IEF-PAGE) separates proteins by their isoelectric
point (which reflects not only the amino acid sequence but also posttranslational
modifications of the protein). Another, more preferred method of protein analysis is the
consecutive combination of both IEF-PAGE and SDS-PAGE, known as 2-D-gel
electrophoresis (described, for example, in Hermann et al. (1998) Electrophoresis 19:
3217-3221; Fountoulakis et al. (1998) Electrophoresis 19: 1193-1202; Langen et al.
(1997) Electrophoresis 18: 1184-1192; Antelmann et al. (1997) Electrophoresis 18:
1451-1463). Other separation techniques may also be utilized for protein separation,
such as capillary gel electrophoresis; such techniques are well known in the art.

Proteins separated by these methodologies can be visualized by standard
techniques, such as by staining or labeling. Suitable stains are known in the art, and
include Coomassie Brilliant Blue, silver stain, or fluorescent dyes such as Sypro Ruby
(Molecular Probes). The inclusion of radioactively labeled amino acids or other protein
precursors (e.g., 35S-methionine, 35S-cysteine, 14C-labelled amino acids, 15N-amino
acids, 15NO3- or 15NH4+, or 13C-labelled amino acids) in the medium of C. glutamicum
permits the labeling of proteins from these cells prior to their separation. Similarly,
fluorescent labels may be employed. These labeled proteins can be extracted, isolated
and separated according to the previously described techniques.

Proteins visualized by these techniques can be further analyzed by measuring the 
amount of dye or label used. The amount of a given protein can be determined 
quantitatively using, for example, optical methods and can be compared to the amount 
of other proteins in the same gel or in other gels. Comparisons of proteins on gels can 

be made, for example, by optical comparison, by spectroscopy, by image scanning and
analysis of gels, or through the use of photographic films and screens. Such techniques 
are well-known in the art. 
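
As a simple illustration of such a quantitative comparison, the intensity of each stained or labeled spot can be expressed as a percentage of the total intensity on the same gel, which makes values from different gels easier to compare. This is a sketch under stated assumptions: the function name, spot labels, and intensity values are hypothetical, and background subtraction is assumed to have been done already.

```python
def relative_abundance(spot_intensities):
    """Express each spot's intensity as a percentage of the gel's total signal."""
    total = sum(spot_intensities.values())
    if total <= 0:
        raise ValueError("total spot intensity must be positive")
    return {spot: 100.0 * value / total for spot, value in spot_intensities.items()}


# Hypothetical background-corrected intensities from one scanned gel.
gel = {"spot_1": 50.0, "spot_2": 150.0}
print(relative_abundance(gel))
```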

To determine the identity of any given protein, direct sequencing or other
standard techniques may be employed. For example, N- and/or C-terminal amino acid
sequencing (such as Edman degradation) may be used, as may mass spectrometry (in
particular MALDI or ESI techniques (see, e.g., Langen et al. (1997) Electrophoresis 18:
1184-1192)). The protein sequences provided herein can be used for the identification
of C. glutamicum proteins by these techniques.

The information obtained by these methods can be used to compare patterns of
protein presence, activity, or modification between different samples from various
biological conditions (e.g., different organisms, time points of fermentation, media
conditions, or different biotopes, among others). Data obtained from such experiments
alone, or in combination with other techniques, can be used for various applications,
such as to compare the behavior of various organisms in a given (e.g., metabolic)
situation, to increase the productivity of strains which produce fine chemicals or to
increase the efficiency of the production of fine chemicals.
Equivalents

Those of ordinary skill in the art will recognize, or will be able to ascertain using 
no more than routine experimentation, many equivalents to the specific embodiments of 
the invention described herein. Such equivalents are intended to be encompassed by the 
following claims. 
APPENDIX A: DNA SEQUENCES



>RXA00004 -upstream 

TTGCACTGTCATGACTGTATCCCGCGAAGAAGTGTCCCTGCCGAGCCGAACTCTGAACAA 
TGCCTTCCGGAAGTATTTTCCAATTCCCGATGTAGGGTCA 

>RXA00004 

GTGCTGACTCAATTGATTGAATCATCGATTTTCGACAACGTTGCGAGCAGGGAGTCCTCT 
GAATTTCTCGGCCATGCTGCCATCGATCTACTTGCTGGCCTTGTCTATGAAAAAGCCACT 
CCCTATGCTCCAGATGAAGCACTTAGAGTGGCAGTTTATGGCTATATTCGGGAGAACCTT 
GGATCCTCACAACTTACGGTCGCAGCTGTAGCCGGGGCGCATAGAATCGCGGTTCGTACG 
TTGCATCGATTATTTGAAGGCGAAGCATACGGAGTAGCGGAATTAATCCGACACCTCCGA 
TTAGAGGCAGTATATGAAGACCTTCGGGATCCTCGCCTCCAGAACCTGACCATTTTGGCT 
ATCGGCATGCGCCACGGCATTTCCAGCCAAGCTCATTTAACAAGACTGTTTCGCGCTAAA 
TATGGGGTACCGCCGGCAGAGTTTCGCCGAGGGTATATTAATAGCGCTGCT 

>RXA00004-downstream
TGAGGGCACCGCAAGCGTGGCGC 

>RXA00006-upstream

AACCTATAGGAAAACGGGAGGTCAGATTACGCTCTTGTGTTACACATCACAGGATATGGT 
AGAAGATGTTCGCCTATCCACCCCGTGAAAGAAGATAAAA 

>RXA00006

ATGGTCGATTTTGACACCATCGCAGCCCGACTTGTCACCGAAACAGAAGAAGCAATCATC 
TACGCCACCCGCGATGGAATAATCAGACTCTGG7UVCGGCGGCTCCGAGAAACTCTTTGGA 
TACACGGCCGGCGAAGCCCTTGGAAAATCACTCGACATCATCATTCCCGAAAAACACCGC 
AAGGCCCACTGGGACGGATGGGATCGCGTCATGGAATCCGGCGAAACTCGCTATGGCTCC 
GAACCGCTTAACGTTCCAGGCATTCGTGCCGATGGATCCAAAATGTCTTTGGAATTCTCC 
ATCACCATCCTGAAGGACGATTCCGGAAAAATCGAAGGCGTTGCAGCTTTTCTCCGCGAT 
GTCACCGCCAATTGGGATGAGAAAAAGGCCCTGCGGATCCGAATCAAAGAGTTGGAACGC 
CAAATCGAGGGCCAT 

>RXA00006-downstream
TAAGGAGATTCTTGGGTGCGCGG 

>RXA00029-upstream

TCCTCGCCCGCACCCACGGTGGAGCGAAACGTGTGCCCAAACGCGGTACTCCCCAACCGT 
TTGCCGTGCGCCAGGTGAGGACCGAGAGGCCAAACGCATT 

>RXA00029

TTGGCCCAGGCCACCGCCCAACTAATCGCTGATGATGAAGCAGTAATTTTCGACAACGGC 
ACCACCTGCCAGGCAGTGGCCCAAGAGTTGGCGGGGCGTCCCATCACGGCATTGTGTCTG 
TCTCTACATTCGGCGGTCGCCCTGGGAAGCCGAGCTGGCACCAACGTTTTCATCCCCGGC 
GGCCCCGTGGAAAACGACTCACTCGCCTTATCTGGCCCGGCTGTGATCACCGCGTTACGA 
GATTTCTCCGCCGATGTCGTGATCCTCGGTTCCTGCTCTACATCACTGGAGCACGGGTTG 
GCCACCACTACCTACGACGATGCGGATiAACAAGCGCGCAGCCATCCATGCTGCCACCCGA 
CGAATCCTTGTGGTGTCCGCCCGTAAACTCAACCACGTTTCCACTTTCCGTTTCGCAGAC 
GTCGCGGACTTACACCAGCTGGTCACAACCTCCGATGCGCCACGGGAGATTCTCGCCGAG 
ATCCGGGATCTCGGCGTGCAGGTTATTACTGTTCCCGCCCCTGACGAGCAACGAAGT 

>RXA00029-downstream
TAACTCTTCATGGTTGCTGAGCA 

>RXA00126-upstream

CAGAAGTAGTATAAGAAGTGAAGTTGATCGCGTCATCGTAGTACGAGTAATTCCACTCAC 
ACATCAATGCGGTGACCACAATTGGGAGGAGAAGTAGCAC 

>RXA00126

GTGACTACACCTGCTGAGAACAACACCCTTAGCCCCGAGACCAAAGTAAGCATCACTGGT 
CGAAACGTTGAGGTTCCTGATCACTTTGCAGAACGAGTAAATACCAAACTCGCAAAGATT 
GAGCGCCTCGACCCAACGCTGACCTTCTTCCACGTTGAGCTACAGCACGAGCCAAACCCA
CGTCGTGCTGACGAAAGTGATCGCATTCAGATCACCGCCACCGGCAAGGGACACATCGCC 
CGAGCAGAAGCAAAGGAAGACAGCTTCTACGCGGCACTGGAAACTGCACTAGCCAAGATG 
GAGCGCTCCCTGCGCAAAGTGAAGGCACGTCGCAGCATTTCCCGCTCCGGTCACCGCGCA 
CCACTAGGCACTGGTGAGGTCGGTGCACAGTTGGTAGCCGAGTCCCAAGAGGCACGCGGT 
GCCGATGAACTGGGCAAATACGATGTTGATCCTTATGCAGATAAGGTCGATGACGTCATG 
CCAGGCCAGGTTGTTCGTACCAAGGAACACCCAGCAACCCCAATGAGTGTGGATGACGCA 
CTATCCGAGATGGAATTGGTTGGACACGATTTCTACCTCTTCGTCAACGAAGAGACCAAC 
CAGCCATCGGTGGTGTACCGCCGACACGCATTCGACTATGGATTAATTTCCCTGTCCGAT 
GCA 

>RXA00126-downstream
TAGCAATTAGTTGCTAAGTACCC

>RXA00129-upstream

AGTTTTTCATTTAAAAAAGGGGCAGTTTCTCATTCTTGCCTGGCTCACGCGACTTCGACA 
TCGCATTGTAGATAAATGGCGAACCTCACTGCAGGTTCGC 

>RXA00129

GTGCTCGGCTCCATCTTCACCGCATCAGCTGTCGTGATGATCCTTTTGGGGCTGGGCATG 
CTGACTGTATTCACCCAACGGTTGGTGGATCAGAAAATCGATATTGCGAGCTCCGAAATC 
GACCGCGCCCGCGTCATCGTCGAAGAGCAAATCACCGCATCCGGCGCCTCAACATCGGTG 
CAGGCGCGAGTGAACTCTGCCCGCGCTGCGCTCTCCAGCTTGGGTACCAGCGGCGGTACA 
GAAACCAACGCCGCCTACGATCCAGTCGTGTTGGTGAACAACGATGACCTGGTGGTCTCT 
CCCGAGGGTTACCAAATCCCAGAACGTCTGCGATACTTCGTCTCTGAGAACCAAGTCTCG 
TATCAGTTCTCCAGCATCGACCAAGGCGACGGATCGTCCTACCAAGCGCTCATCATCGGA 
ACGCCCACGGAAAGCGACATCCCGAACCTCCAGGTGTATCTGGTGTTCTCCATGGAAAGC 
GACGAATCCTCTCTTGCTCTCATGCGAGGACTCCTCTCAGCTGCACTGCTGATCGTGGTG 
GTGCTGCTGGTCGGTATCGCATGGCTAGCCACCCAACAGGTCACCGCGCCGGTGCGTTCG 
GCGAGCCGGATTGCGGAGCGTTTCGCTCAAGGCAAACTGCGTGAACGCATGGTGGTGGAA 
GGCGAAGACGAGATGGCCCGCCTGGCGGTGTCCTTCAACGCGATGGCCGAATCGCTGTCC 
GCGCAGATCACCAAATTGGAGGAATACGGCAATCTGCAACGACAATTCACATCGGATGTC 
TCACACGAATTGCGCACACCGCTGACAACGGTGCGCATGGCTGCTGATCTAATTGCCGAT 
AGTGAAGATGAACTTTCACCCGGTGCGCGCCGCGCCAGCCAACTGATGAACAGGGAGTTG 
GACCGATTCGAGTCGCTGCTGAGCGATCTGTTGGAAATTTCCCGACACGACGCCGGCGTT 
GCCGAACTGTCCACCGCGCTTCACGATGTCCGCATCCCAGTGCGATCGGCATTGGAACAA 
GTACAACACTTGGCCACCGAGCTCGATGTGGAATTGCTTGTTAATTTGCCCGAAGAAGCG 
ATCAACATTCAAGGCGATTCCAGGCGCATCGAAAGAATCATTCGCAACCTTCTAGCCAAT 
GCGATCGACCACTCCAAGGGCTTGCCTGTTGAGTTGTiAAGTTGCCGACAACGTGGACGCA 
GTAGCGATCGTTGTTATTGATCACGGCGTCGGCCTGAAACCTGGACAAGACGAATTGGTG 
TTCAACAGATTCTGGCGAGCCGACCCTTCGCGCGTCCGCCATTCTGGTGGCACCGGCCTG 
GGTCTTGCGATTTCTCGCGAAGATGCGATGCTTCATGGAGGAAACCTTGATGCGGCGGGA 
ACGATCGGTGTTGGTTCCATTTTCCGTTTGGTCTTGCCTAAAGAACCGCATGGAAATTAT 
CGTGAAGCACCGATCCCGTTGATCGCTCCAGAGACACCGTGGGAAGGGGAGCAGCAG 

>RXA00129-downstream
TGAGTAAAATTTCGACGAAACTG 

>RXA00130-upstream 

GATCGTGGTTGCCCCTGATGAAGACCCCGGCCACGTTGCGCAGAGAATCGTGGAATTCCT 
GGGTACTATAAACTAATCCCAATTAGCAGGAAGGATTCTC 

>RXA00130 

ATGTCACAGAAAATTCTCGTGGTTGATGATGATCCCGCCATCTCCGAGATGCTCACCATC 
GTGCTCAGCGCAGAAGGCTTTGACACCGTAGCTGTCACCGACGGCGCACTCGCCGTGGAA 
ACCGCCTCCCGGGAACAACCGGATCTGATTTTGCTCGACTTGATGCTTCCAGGCATGAAC 
GGCATCGACATTTGTCGCCTCATCCGCCAAGAATCCTCCGTACCCATCATCATGCTCACC 
GCCAAAACCGACACCGTTGATGTGGTGCTCGGTTTGGAATCCGGTGCAGACGATTACGTG 
AACAAGCCTTTCAAAGCGAAAGTyVCTTGTCGCCCGCATCCGTGCCCGCCTCCGCGCAACC 
GTGGACGAGCCCAGCGAAATCATCGAAGTCGGCGATCTGTCCATCGACGTCCCAGCACAC 
ACCGTCAAACGAAACGGCGCTGAGATTTCCTTGACCCCACTCGAATTCGACCTCCTGCTG 
GAACTCGCCCGCAAACCACAGCAAGTATTCACCCGTGAAGAATTGCTGGGCAAAGTGTGG
GGCTACCGCCACGCATCCGACACTCGACTGGTCAACGTTCACGTTCAGCGTCTGCGCGCC 
AAGATTGAAAAAGATCCAGAAAATCCGCAGATCGTCCTCACCGTCCGCGGTGTTGGCTAC 
AAAACTGGCCACAACGAT 

>RXA00130-downstream 
TAAGTTTTTCATTTAAAAAAGGG 

>RXA00182-upstream 

CAGTGGTGATTTTTCAAATAACCGGTAGGCTGGAGTACTTGTCCAATGGTTTCGCACAAG 
GTTACGAGAAACCACTCTTCAACATAGGTGAAATGAATTC 

>RXA00182

ATGACTTCTCACTTGCTTCACGGTCTCTGGATCAAAGATCGCGGTCTGCAACTGTGGATT 
GAGCAGGTCGAAGGGCACCGAATTGTGCTTCCAGAGGCGGTGGAAAAAGGCACGTTCCCG 
CCGGTAGTGGAGCAAATCCTCGACGGGAAAACCTTCCGCGCGCGCATGAATGTGCATCTG 
CGCACTCCGAAAGGGCGCCATGTTGAGCTGCCCACGCCAACAGCAGCTTTTACCCCTGAA 
GAAGCAGTCACGGTTTTCTCACAATTAAGTTTTTTGAAAGCAGAAACCCCTGCCGCCACC 
CGAGCGCAACGTGACTCCATTGCGCCCGATCTGTGGTGGCTGATTGTCATGTATCAAGGC 
CTGGCGCGTTTTGTGCAGGCCGGCCGCGTCACGCTTCGCACGGTGATGATGGATAATGCC 
TGGTGGCCCCAGTGGCAACTATCTGCCAGCCTGTCGGAGCGTGGGTGGCTCGCGGAAATG 
AACCATGCCGCGCCGGGTATTTTGCGGATTAATGGTGGCCGAGATTTGGCCGGAAGCATG 
TCCAATGAGCTTCCGCACTGGATCGCCAACGCCATTTTGCGTGATTACCGCGATGAAACC 
ATGCCGTATGCGCGCCATGAGTTTGTTGAGGCGTTGCTGTTTAACCATTCGCTGCGCAAG 
GGCTCGACCATGCTCACCCATGCGCTGAATCAGTGGAAAAACACCATTACATCTGCGTCT 
TTGCAGCTGGTGATTTTGGTGGAGGAGCCCCCTGCGGAATCGGATTATGAAGATCCGATG 
GATTCCGTGTGGCCGGTGCGTTTGATGGTGCGCACGGGCGTGGATGCGCCGCAGGCGATT 
CAAAAAGGATCGATCGATAGCGGGGGAATGGAGCAACTGCGCTCCCAATACGAAACCGCA 
AAAACCACCTCCATGCTGCTGGATCCTGCGCGCGAAGACGCGATGCTGGGGCATATGGTG 
GACATCGCCCAAAACGGTGACTGGGATATTTTCTTAACCACCGAGGAGATCGTCAACTTT 
ATCTCCCACGATGTAGCCAAGCTGCGCAAAGCCGGCATTCCCGTCATGCTGGCCAAAGCC 
TGGAGCACCTATGAAACCCGCGCGCAGGTGGAAGCGCGCACGCCGAATGATGCCGCGGAC 
TCTTCAACCAAGGCAATCATTGGTCTTGACCAGCTCGTGGAATACAACTGGCGCATCAGC 
GTCGGCGATATTCAGCTGTCCGACGAAGAGATGCGCGAACTCATCGATTCCAA7\ACAGGC 
CTCATCCGCCTGCGCGGCGATTGGGTCATGGCGGACCAAGACGCGTTGCGACGCATCACC 
AGCTACATGGAGGAACTATCAAAGTCCTCCGAAAAACGCGCGCGCACCGAAATGGAAAAA 
GTGGCCATGCAGGCCAAACTCGCCGAAGCAAACGGCGAAGAAGGTTGGCAACTCCTGGCT 
GCCAAGGCTGAAACTCTCCGCAAGGAATTCAATGAGAAATTCAGTGGCGATGGACT^GGC 
GAAGTAACCCTTGCTGAACTGCGCGAAATCGCACTGAAAGCCGCCGAAAACGAACCAGTG 
GAATTCACCGGCTCGCAATGGTTCAACTCCTTGCTCGGCGGCACCGAAACACCCGCGCCG 
GTGCGCGTCGACATCCCCGACACGGTCCTTGCTGACCTGCGCGAATACCAGCGACGCGGC 
GTGGACTGGCTGTACTGGATGTCCGCAAATAATTTAGGTGCAGTGCTTGCCGACGACATG 
GGCTTGGGAAAAACCCTCCAGTTGTTGTCCCTTTTGGCAGTCGAGCGCGCAGAAAACCCA 
GAGTTGGAACGCGGCCCGACGCTCGTGGTGTGCCCAACATCTGTGGTGGGAAACTGGGCA 
GCCGAGGCGGCTAAATTTGTGCCTTCACTAAAGGTATTGATGCATCACGGCCCGCAGCGT 
TTGAATGATGCCGATTTCCTGAGTCAATCCAAGGGCATGGACTTGATTATCACCTCATAC 
GGTGTGATCACCCGCGATTTCAAACTCATGGGCCAGGTAGGTTTTGAACGCGTCGTGCTC 
GATGAAGCGCAGGCAATTAAAAACTCTTCCACACGCGTATCAAAGGCAGTGCGCTCGTTG 
CCTTCCCGCCACCGCGTTGCACTGACTGGCACGCCCGTGGAAAACCGCCTGTCTGAAATG 
CGCTCCATCTTGGACTTCTGCAACCCCGGCGTGCTGGGTTCTGCATCGTTTTTCCGCAAC 
CACTTTGCCAAGGCCATTGAGCGTGAACAAGACGACACCATGACTGAGCGCCTCCGCCAG 
CTCACCGCACCGTTTATTTTGCGACGCCTCAAAACCGACCCGAATATCATCGACGATCTG 
CCTGAGAAAACCGAACAGATCATCCGCGTGGATATGACCACCGAACAGGCATCTCTGTAC 
AAAGCGCTGGTTGAAGATGTGCAGAAACAACTCGATGAACGCCAAGGAATGTCACGTAAA 
GGCCTGGTCCTGGCGACCATCACGCGCATCAAGCAGATCTGTAACCACCCAGCGCACTTC 
CTTGGCGACGGCTCAGAGGTGACACTGAAAGGTAAACACCGTTCCGGCAAGGTGGAAGCG 
CTCATGGAACTGATCGATACGGCAGTAAAAGAAGAGCGCCGCATGCTGATCTTCACCCAG 
TACGCAGCCTTTGGCCGTATTTTGGCGCCGTACCTGTCTGATCGATTGGGCACGAATATC 
CCATTCCTGCACGGCGGCGTGACCAAACCAGGACGCGACCGCATGGTGGCGGAATTCCAA 
TCCGAAGACGGACCACCGGCAATGATCCTGTCTTTGAAAGCCGGCGGTACTGGCTTGAAC 
CTGACAGCTGCATCCATCGTGGTCCACATGGATAGATGGTGGAACCCAGCCGTGGAAAAC 
CAAGCAACTGACCGTGCCTTCCGCATCGGCCAGCGCAAAAACGTGGATGTATACAAGATG
ATCACAGTCGGAACCATGGAGGAATCCATCCAAGATATCCTCGATGGAAAAACACACCTG
GCCAGCGCCATCGTGGGCGAGGGCGAAGGCTGGATCACCGAACTCAACCCAGAAGAATTG 
GCTATGCTGATGAGTTACCGCGAAAAGGAGGGTGCAGATGAC 

>RXA00182-downstream
TGAATCACGCCGCGTGAAAATGG 

>RXA00221-upstream 

TCGGCACGATCTTGCCGGTTTTCCTTGTTTAGGCTCCGAAACGGATGTATACTGCAAAGA 
ACGGCACGATCATGCCGAAAGTTGGATCCAAAGGATGCAC 

>RXA00221 

ATGAACGCAGAAGAAATCGG7VATGGCGCTGCTCAACGGACGCAAAGAGCTAGGCCTTAGA 
CAAGGAGAGCTCGCAGACTTAGCTGGAGTTTCTGAACGATTCATCCGCGATGTCGAAAAG 
GGAAAAACTACCGTCCGCCTGGACAAAGTCATCGATGTACTCCGCGTCCTTGGACTCGAG 
CTTTCTGTTGGAATTCACGATCCCCTCAAGGTTAATCAA 

>RXA00221-downstream
TGACCCCCACTGCCGATATCTGG 

>RXA00253-upstream

ACCCATGTTAGATGTTTTATTCAGGGATTTTAGTTGATATGTCCAGTATCTCGCTGAAAA 
CGCTGGTTGTCTTGTAGAAAAAGGCGTAACGTCATATAAC 

>RXA00253

ATGCCTAGCGAAACTATGAAACCAGCCGTAGCGTCAACTCTGGCGGCCACTTCCACGGGA 
CGTCGTCCTGGACGCCCCACCCAACGTATCCTTTCCGTCGAATCCATAGTGGAGCGCACT 
TTAAACATTGCCGGCCGCGAAGGATTCGCTGCCGTGACCATGAACCGCCTCGCCCGAGAC 
ATGGGTGTCACCCCTCGCGCACTGTATAACCATGTGCTAAATCGTCAAGAAATCATTGAT 
CGCGTCTGGGTGCGCATCATCGATGATATCAAGGTGCCCGATCTTGATCCGGACAATTGG 
CGGCAATCTATTCATACGCTGTGGAGCTCATTGCGCGACCAATTCCGTGAGACTCCACGT 
GTTCTTCTGGTCGCGCTGGATGAACAGATCTCTACTCAGGGCACTTCCCCACTGCGAATC 
GCGGGTGCGGAGGAGTCCTTGAAGTTCTTGACTGATATCGGGCTGTCCCTCAAGGAAGCA 
ACCATCATCCGGGAGATGATGATGGCTGATGTCTTCAGCTTCACCCTGACTTCTGACTAC 
ACCTTTGACAATCGTCCAGAGGGCGAAAAGCCGGATGTGTTTGCTCCGGTTCCTAAGCCA 
TGGCTTGATGAGAACCCAGATGTGGAAGCGCCACTGACCCGTAAAGCAGTCGAAGAGTCC 
GTCTCAACTTCTGACGAACTCTTCGGCTACATGGTGGAGGCTCGCATTGCTTATATTGAA 
AAGCTGCTTGCCGCCAAA 

>RXA00253-downstream 
TAGTTTCTAAAGGTTATTGAGGG 

>RXA00284-upstream

AGGACAACCAAATTGGCGCCTCTATCGGGTGCTCACCACACCTGAACAGTCGTAGACTCT 
TTGTAACTACCGTTGTTGTTTTTCCATATCCAGGAGAGTC 

>RXA00284

ATGGCCCGCAAGCTGAAGGACAAACTTCCCCGAAGTTTTGACAAAATCGTCGAATCGGGC 
GATTTTGACGCTTTCAAAGAGGTCTTCACCGAGCGCGCCCTCGACGCCAAAAACCGTCAT 
GGCAACACTGCCCTCCACATGCGTGGAGTACCTGAAGAATTCAAGATTTGGATGCTTGAC 
CAGGGCCTCGATGTGGATATCCGCAACGAAGACGGCGACACCCCGCTGCACGTGCACTCC 
CATGACTGGAACTTAAGCCCCGATTTTCTGCTCAAACGCGGCGCCGATGTCTGCGCAGTC 
AACAATGAAGGCGAATCGGTTGCCTACTCCGCTGCCTTCTTCCCAGAAAACCTCAAAAAG 
CTTATCGACGCCGGCGCCGACCCCTACTCGCGCGCCAACGACGGCACCACGCCGCTGATG 
CGTGTCATTCGAAGCGCCGACACCGGACAAATCATCGAACTAGCAGAAATAACCAAGCTA 
CTTTCCGGCACAGAATTCACCGACGCAGAATTCCGAGAAACCCAAGAACGCATCATCGCA 
ATGGGTGAAAGATTCGAAGATGTCCGGGAAGTCTACAACGAAGAATCCGTCGACCAAGCA 
TCTGCGGACATGATCTGGCTCTACGATCGTTTCGACATCCCCGAAGAACTCCGCGCCAAC 
ACACCAATTTTGCACGACGGAGTAAGCCCAATAGAACTGCCTGGGGATACCTGGCAAGAA 
CAATTCATCGAAGGCTACGATCTCCTCGTTCCCGCAATGGGCAAAGCGAAATCCCTGCAA 
GGCGAAGCCATCCGAATTGCCGGACGAGTATCCAACGAATTTCACGGCAACGGTGGCGTC
AACTGGGACAAAGACTTCAAACGCATGGCCAAATCTCTCAACCACATTTGTGAGCAGGGC 
GTTCCTTTGGGTGAGCCAGAATTAGAAGAACTGGCTGCGGCCGTTAAATCAGTGCGCAAA 
GGAGAACCCACCGAGGAGGAGATCGACACCCTTCCACGGTTGGCCACCAAATGGGTCGCA 
CAAAACCCACAACCGCTGCCACTGGGAGAGGTTGACTACAAGCGC 

>RXA00284-downstream
TGAACGTTGAGTTTGAGTTTGGT 

>RXA00287-upstream

ATGGGCTCAACGCTGCAAACATCATGGAGGGGCACCGGCTCGTTGAGCAGGGTAAAACCT 
CAGG7VAAAATTGTTGTGAGGGTATAAAGAGGACTTGAAAA 

>RXA00287

ATGCACCATCTACGCTATGAATCACCAATCGGAGAGCTTCTTCTTGTTGCAAGTGACCAA 
GGGCTAACCTATGTGGCATTCTCCGATGAAAACTACGCAGCTTGTACTGTCGGGTCGACC 
CCGGGAACCAATGCGGTGCTGGAACAGGCAGTTGCTGAGCTTGAAGAATACTTCGCAGGG 
AAACGTAAAGAGTTCAGCACTCCCCTGGATTGGCCAAGCCAAAATCTGCTGAGCTTCCGC 
GGTAAAGTGCAGGAATTTTTGCTGTCCATTCCTTATGGGGAGAGTAAAACTTACAAACAG 
ATCGCCGCTGAGCTTAATAATGTGGGCGCGGTTCGTGCAGTGGGAAGCGCCTGCGCCACC 
AACCCCTTGCCAATCTTTGCTCCTTGTCACCGAGTACTGCGCACTGATGGGGCGTTAGGT 
GGCTACAGAGGAGGCTTGGAAGCAAAACAGTGGCTGTTGGAGCTGGAACGTCCT 

>RXA00287-downstream
TAGTTTGTGTCCGCGCACGGAGC 

>RXA00291

GCCGCCCTTGCTCTTATTTCTGTGTTGGGAATCCTTATCGGCGTGGGTGTAGCCATGGGC 
ATGCGACGCCGTTGGGAACGCGTGACCTTGGGTTTGCAGCCGGAGGAGCTAGTGACCCTT 
GTGCAAAATCAGACTGCAGTCATCGATGGCATTGATGAGGGCGTGCTGGCGCTGAGCCCA 
AACGGAACAATTGGGGTGCATAATGAGCAGGCGCAATCCATGATTGGTGCAGGTCCTATG 
AGTGGCAGGACGTTGAAAGAACTAGGGCTTGACCTGGGTCTTGATGGCGTTGTATTGCAT 
GGTCAGCATCCGGAAACCGTTGCCCATAACGGCAGGATCCTCTATCTGGATTTCCACCCC 
GTGCGCCGTGGGGATCAAGATTTAGGCTACGTGGTAACCATCCGCGATCGTACCGACATC 
ATTGAACTCAGTGAACGCCTCGACTCTGTGCGCACCATGACCCACGCACTCCGCGCCCAG 
CGCCACGAGTTTGCCAACCGCATCCACACCGCAACAGGGCTTATCGACGCCGGCCGCGTC 
CACGACGCGGCAGAGTTTCTAGGCGATATATCCCGCAACGGGGGACAGTCACATCCATTG 
ATCGGATCAGCGCACCTCAATGAAGCATTTTTGAGCTCATTTTTAAGTACTGCTTCTATT 
TCGGCATCTGAAAAGGGCGTTAGTCTGCGCATCAACTCTGACACGCTCATCCTTGGCACT 
GTTAAAGATCCAGAAGATGTAGCAACCATTTTGGGTAATTTAATCAACAATGCCATCGAC 
GCCGCGGTGGCAGGTGAAGCCCCACGGTGGATTGAGCTTACGTTGATGGATGATGCCGAT 
ACGCTGGTCATTTCTGTTGCAGATTCTGGTCCTGGAATCCCAGAGGGCGTGGATGTATTT 
GCCACAGCCACCCAGATAGGAGACTCTGAAGATAATGAACGCACCCACGGGCATGGCATT 
GGTCTAAAACTGTGCCGGGCTTTGGCTAGATCACATGGTGGCGATGTCTGGGTGATTGAT 
AGAGGAACCGAAGATGGCGCTGTATTTGGAGTGAAACTACCGGGAGTAATGGAG 

>RXA00291-downstream
TAATGGATCAAACACTTAAAGTT

>RXA00292-upstream

GGGCTTTGGCTAGATCACATGGTGGCGATGTCTGGGTGATTGATAGAGGAACCGAAGATG 
GCGCTGTATTTGGAGTGAAACTACCGGGAGTAATGGAGTA 

>RXA00292

ATGGATCAAACACTTAAAGTTTTAGTAATTGATGATGATTTCCGCGTCGCCGGCATTCAC
GCCTCCATCGTTGATGCGTCCCCTGGATTTTCGGTGGTCGGTACCGCGCGTACCCTCGCA 
GAGGCAAAAACCCTGATCGCCACATTTTCCCCGGATCTCCTACTTGTTGATGTCTACCTC 
CCCGACGGCGATGGCATTGACCTCGTGGGCACCTCCAATATTGATGCGTTTGTGCTCAGC 
GCAGCCGATGACATCAAAACAGTTCGACGCGCCATGCGTGCCGGGGCACTCGGATATCTG 
CTCAAACCATTTCCCCAAAAACGTCTCGTGGAACGCCTTGACCGTTACGTCCGCTACCGC 
CATGTCTTATCCGGCACCCAAGGACTTTCCCAAGACAAAATTGACCAGGCAACCGCAATC 
CTCAACGGCACCCAAGCGCCGGTCACCGTCTCTAGATCCGCCACAGAGCAATTACTTCTC
GACGCCCTGGAAGGCCAAGAACTCTCCGCAACAGAAGCTTCCGAAGCTGCCGGAGTTTCA 
CGTGCCACAGCACAGCGCAGGCTGGCAGCGATGGCTAGCCAAGGTGTGATCCAGGTTCGC 
CTTCGGTACGGACAGTCCGGGCGACCAGAACATCTATATTCAAAGCCACTGCTC 

>RXA00292-downstream
TAGTAACCTTTGTGGATGTCCAC 

>RXA00307-upstream

GTGTTTTAAGAAGTGTTTTTAAGAGAATACGCATTGAAGTAGTTTTCCCTGCTGGCAGCG 
GCATAAATTGAGTTTGGAAAAACAAGGAAGGCAGCCTCCT 

>RXA00307 

GTGAAGGATCTGGTCGATACCACCGAAATGTATCTGCGCACTATTTACGAGCTGGAAGAA 
GAGGGCATTGTTCCTCTGCGTGCTCGTATCGCAGAACGCCTTGAGCAGTCCGGCCCAACT 
GTCAGCCAGACTGTCGCCCGTATGGAACGCGACGGTCTTGTGCACGTCAGCCCCGACCGC 
AGCCTCGAAATGACTCCAGAGGGACGTTCCCTCGCCATCGCCGTGATGCGTAAGCACCGC 
CTAGCAGAACGCCTCCTTACCGACATCATCGGCTTGGACATCCACAAAGTCCACGACGAA 
GCATGCCGCTGGGAGCACGTGATGAGTGATGAGGTTGAACGTCGCCTCGTTGAAGTTCTT 
GACGATGTGCATCGCTCCCCTTTCGGTAACCCAATTCCTGGCCTCGGCGAAATCGGTTTG 
GATCAAGCAGATGAGCCTGATTCCGGCGTTCGTGCCATCGAT 

>RXA00319-upstream 

ACCGCAGAAGAATTAGAAAGGATCATCCATGACTGGGCCTAAAACTTCGCTACCTGTGGA 
AATTGTTTTCGTATGCACCGGAAACATTTGCCGATCCCCC 

>RXA00319 

ATGTCGGAAGTCATCGCGAAGGCAAAAGCGGAAGAAGCTGGCTTGGAAGACAACGTCATT 
TTCTCCTCCTGTGGCATGGGCAATTGGCACGTTGGCCAACCTGCTGACAAGCGAGCTCTC 
GCGGAACTGAAATCAGCCGGTTACAACGGCGACACCCACCGCGCAGCACAACTTGGTCCC 
GAGCACATGCGCGCAGATCTCTTCGTCGCGCTAGATTCCGGCCACGCCGGTGAGCTCGCC
GCAACGGGTGTTCCCAACGACAAAATCCGCCTCATGCGTTCCTTCGACCCAGAGTCCAAC 
CCCACCGACGATGTCGCAGACCCTTACTACGGCACATCCCAGGATTTCGTGCTCACCCGT 
GAAAACATCGAAGATGCTATGCCGGGCCTTTTGGAGTGGGTCAGAGATCACATCCGCACT 
GATTCT 

>RXA00319-downstream 
TAGGTCTTTGAGCTAAAAAGTCC 

>RXA00348

ACCCGCGAGCGCCTCGAAAACGCCCAATACCAGGTACAACGCGACCGAGTCAGGGGTGCC 
ATGGAAGTCTTTATCGAAGCGGGAATCGATCCCGGCACCGTGCCGATCATGGAATGCTGG 
ATCAACAACCGCCAACACAACTTCGAAGTGGCCAAAGAACTTCTAGAAACACACCCAGAC 
CTCACCGCAGTACTCTGTACCGTCGATGCACTGGCATTCGGCGTTCTGGAATACCTTAAA 
AGCGTAGGTAAATCAGCGCCTGCAGATCTATCCCTCACTGGTTTCGATGGCACCCACATG 
GCACTCGCACGGGATCTCACCACCGTCATCCAACCCAACAAACTCAAAGGGTTCAAAGCC 
GGCGAAACACTGTTGAAAATGATTGACAAAGAATACGTGGAACCAGAAGTGGAATTGGAA 
ACTTCCTTCCACCCAGGTTCCACGGTTGCGCCAATC 

>RXA00348-downstream
TAGGCTTGTGGCACTTTTCGTGC

>RXA00350-upstream

AGATTCGGTAATAAAAGGTAAAAATCAACCTGCTTAGGCGTCTTTCGCTTAAATAGCGTA 
GAATATCGGGTCGATCGCTTTTAAACACTCAGGAGGATCC 

>RXA00350 

TTGCCGGCCAAAATCACGGACACTCGTCCCACCCCAGAATCCCTTCACGCTGTTGAAGAG 
GAAACCGCAGCCGGTGCCCGCAGGATTGTTGCCACCTATTCTAAGGACTTCTTCGACGGC 
GTCACTTTGATGTGCATGCTCGGCGTTGAACCTCAGGGCCTGCGTTACACCAAGGTCGCT 
TCTGAACACGAGGAAGCTCAGCCAAAGAAGGCTACAAAGCGGACTCGTAAGGCACCAGCT 
AAGAAGGCTGCTGCTAAGAAAACGACCAAGAAGACCACTAAGAAAACTACTAAAAAGACC
ACCGCAAAGAAGACCACAAAGAAGTCT 

>RXA00350-downstream 
TAAGCCGGATCTTATATGGATGA 

>RXA00363

CGCTCACTCACCGACCAAGTCATGGATTTCGTCCGCGAATCCACCCTTGATAAAACAATG 
GTCACCGGAGAGTGGTACAGCGTTTACCAGGTCAGCGACCAATTAGGCATTTCCCGCTCC 
CCCGTCAGAGACGCGCTGCTCCGCCTGGAAGAAGCAGGGCTCATCCGCTTCACCAGGAAC 
CGCGGATTCCAAATTGTCGAAACCAAACCCTCTGATGTCGCCGAAATTTTTGCCCTTCGT 
CTAGGCATTGAACCCGCCGCAGCATACCGGGCAGCACAGCTACGCACCGAAGAACAGCTC 
CACGAAGCAGATGACATCATTGCACTCATGGCGCAAGCCGAGGCCGACAATGACGAAGAA 
GCATTTTTCACCCATGACCGGCAGTTTCACCGACAAATTATGACCATGGGACACTCCCAA 
CGCGGGGCTGACCTGGTAGAAAAACTACGCGCACACACCCGTATCCTCGGTGCTTCTACT 
GCCGGGAACAAACGCACCCTTGGCGATATTTTGGAAGAACACGAACCAATCTTGGATGCC 
ATCAAACGACAATCAGCAGAAATGGCACGAGCCACCATGCGGGAGCATATCCAAGTCACC 
GGAAAGCTACTACTAGAACAAGCAGTGGAAAAATCCGGCGAAGGAGCTGCTCAGAAGATT 
TGGGATCAGTACACGGCGGGAGTT 

>RXA00363-downstream
TAGGCATATTTACCTAATCAATT 




>RXA00400-upstream

TGTTCACAGTCAGCCACCTGAGGATGAAGACATCGATCGAATTCAGAAAAAGCTGCAGGC 
TGAGGGCTTCCCCACCCGCAATTAATTAATTGGAGTTTTG 

>RXA00400

TTGTTCACTCTTGAACAGTTGCGGTGTTTTGTCGCCGTCGCCAATCATCTTCATTTCGGA 
AAAGCTGCTGCAGAGCTATCCATGACGCAGCCGCCGTTGAGTCGTCAGATTCAAAAGCTG 
GAGAAGATCGTCGGTGCAACCCTGCTTGATCGTGACAACCGCAAGGTGGAACTGACCACT 
GCGGGTTTCGCATTTTTGAAGGATGCTCGCCTCATTCTCAATTCCACCGAGAAGGCGGCT 
GAGCGCGCACGATTGGCTAGCTCTGGCATGTGGGGACAGCTCAATATTGGATACACCGCT 
GCAGCGGGTTTTTCCATTCTGGGCCCGACGTTGAATCAGTTGCATGAGAAGATGCCGGGG 
GTCAGTGTCGATCTTTTTGAGATGGTCTCCACCGAGCAGATCGCCGCCTTGGAATCTGGG 
CTACTGGATCTTGGCATTGGCCGATTGAGCTCGCCAGTTGAGGGTCTTCAAACTCGACGT 
CTCCAGGCAGATTCCTTGGTTCTTGCAGCTCCGAAGGGGCATCCACTTCTTGATCAGAAT 
CGACCACTGTTGCGGAAGCATCTGACTGGGGTTCCTTTTCTGCAGCACTCTCCCACCAAG 
GCGAAGTACCTCTACGACATCGTTGTTAGAAACTTCACGATCAATGATGCGCAGGTGCAA 
CATACGCTGAGCCAGATCACCACGATGGTTAGTCTGGTGGCCTCTGGACTGGGTGTTGCG 
CTGGTTCCGGAGTCTGCGAAAAAACTCAATTACAGCGGTGTTGAGTATCGCCATTTTTAT 
GATCTACCTGTTGGTTTAGCGGAGCTGCAGGCTATTTATTCCACCTCGAATGATAATCCT 
GCGGTGCGGAAATTCATCAAAAACATTGACGATACCTTT 

>RXA00400-downstream
TAAGCATTTCAACATGCCAAACT 

>RXA00464-upstream

ATCCGCAAGAACCCATTGCACGCCTTGCTCCTCCACAATGTGAAAAGTGCGGAGGGCTGA 
TTAGACCAGGTGTG 

>RXA00464

GTGTGGTTTGGTGAGAACCTGCCCGTAGAAGAGTGGGATATTGCAGAGCAACGCATCGCA 
GAAGCCGATCTCATGATCATTGTGGGTACCTCCGGGATTGTTCATCCTGCAGCAGCACTC 
CCGCAATTAGCCCAACAACGCGGCGTTCCCATCGTGGAGATCTCCCCAACGCGCACCGAA 
CTTAGCCGGATCGCAGACTTCACCTGGATGTCCACCGCAGCCCAAGCGCTACCAGCGTTG 
ATGCGAGGTTTGAGCGCC 

>RXA00464-downstream
TAACATGACTGAAGATGACTTAG

>RXA00494-upstream

TTCATCATTGCGGTCGACACTGTTGTCTCGGGCGAACACCCCACATCAGAGAACGCGATC 
AAAGCTATAAAAAGTAGCTGACAATAGGGAGTATTTGAAG 

>RXA00494

ATGACATTGCCTCACCAGCTTCCCGGGCCAAATGCAGACTTCTGGGACTGGCAGTTGCAC 
GGAACGTGCCGCGGCGAGACCTCCGACGTGTTCTATCACCCGGACGGCGAGCGCGGTCGT 
GCTCGCCAGCGTCGGGAGCTGCGCGCAAAGGCCATCTGTGCAGCATGCCCAGTATTGGAA 
TCCTGCCGCAAGCATGCACTAGCTGTAGCAGAGCCTTATGGAGTATGGGGCGGACTTTCA 
GAGTCCGAACGACTGGTTATCCTTCGCAACAACGAGCGCAAGCAACCAGTAGCAGTT 

>RXA00494-downstream
TAAAAGAGCAGACCCGGTCACCA 

>RXA00516-upstream 

AAGCAAAAAATTGCTTGTCGACGTCTCCCCCAACCTAGCATCCACTTTCTGCAACCCAGT 
GTCACAAATAGTCTAAATTTCGGTGTACTAAGGTGTTGTC 

>RXA00516 

ATGGTCCAAAAAGATGCCCAGGCCTCCCCTGCTACGAGAAAAGCAGATCAGGTATACACA 
CAGATTCGTCGTGAAATCGAAGATGGAACCTTAAATCCTGGGCAACGAATGTCGGAAGTG 
TGGCTGGTTGAACACACCGGCGCTTCGAGAACCCCAGTCCGGGATGCTCTCCGCCGGTTA 
GCCGCAGACGAGTTGATCATTTTGGAGCCACGTCAGGCGCCTATGGTGTCGCCACTTTCG 
CTTCGCCACATTAAGGATCTGTTTGAGTTCCGCAGGATCGTCGAGGTCGCAGCGCTTGAG 
GAAATCTCTGTTGGAGCGAGTAAATCACCGCGTATCTTTGGTGAGTTTTCTACGTTGGCG 
GCAGATTTTCGAGAGCTGGAAAACTCTGCAGACGATGCAGATTTCACCGCCGATTTTAGG 
CGATTGACCAGTAAGTTTGATGATCTTGTTGCAGCAAATACTCACAACCAATTCCTTGGA 
CGCAGCATCTTAAGTTTAAAACCGCACACCACGAGGCTGCGGATCATTGCGCATTCCGAT 
CATGCGCGTCTGCGCCAATCGGTTCAGGAACATATTGAAATGTGTGAAGCTGTGGCCTCA 
GGAGATTTAAGGTCGGCAGGCGCTGCGTGTAGACAGCACCTGATCCATGTAGAAAAGAGC 
ATTTTGACCGCATTGATTAATGCTGATTCTACGGGCTCGCAGGGCATTGATATTAGGTCT 

>RXA00516-downstream 
TAGAACCAGCGTGCACTGATGGC 

>RXA00551-upstream

ACTTCTCAAGTGACGCCAAGGTAAGTTGTACTTTTTCTGTCCAAATTATTGCTTTTTCCG 
TAGATAGGTTATCGAACGGAAATTACTTGGCAATACCGCT 

>RXA00551 

ATGCTGGCAGGCATGCCTAATTTAAACGCTGAGGAGCTAGCAGTCCGCGTGCGACCCGCG 
CTGACAAAACTCTACGTTCTCTATTTCCGCCGCTCTGTGAATTCTGACCTCTCGGGTCCA 
CAGCTCACTATTTTGAGTCGCCTGGAAGAAAACGGCCCATCCCGAATTAGTCGCATCGCG 
GAACTTGAAGATATTCGTATGCCAACCGCTTCGAATGCTCTGCATCAGCTGGAGCAACTC 
AACCTGGTTGAGCGTATCCGCGACACCAAAGACCGCCGAGGCGTGCAGGTTCAGCTCACT 
GATCATGGACGCGAAGAGCTTGAGCGCGTGAACAATGAACGAAACGCA 

>RXA00583-upstream

CTGAGCTGGCCGCAGCAGTAAGGGAACGCCGAGCAGCAGCCAAGTAATTAAGGGCGCTAG
ACTGTTAAATGTGTTAAACCTGCCCAGACTGCTGTACCCG 

>RXA00583 

GTGTATGAGCGCCGTCTTTTAAGAGAACTAGACGGCGCCAAACAGCCCGGTCACGTTGCC 
ATCATGTGTGATGGCAACCGACGCTGGGCCCGGGAAGCGGGCTTCACTGATGTCAGCCAT 
GGGCACCGAGTGGGTGCCAAAAAGATCGGCGAGATGGTCCGCTGGTGTGATGATGTAGAC 
GTCAATCTCGTGACCGTTTATTTGCTGTCTATGGAAAACCTTGGGCGATCCTCCGAAGAG 
CTGCAATTGCTGTTCGATATCATCGCCGATGTCGCTGATGAACTCGCGCGTCCTGAAACC 
AACTGTCGAGTCCGCCTCGTTGGTCATTTAGATCTGCTCCCAGACCCAGTTGCTTGTCGT 
TTACGCAAAGCTGAAGAAGCTACCGTTAACAACACAGGCATCGCAGTCAACATGGCTGTC 
GGTTATGGCGGACGCCAGGAAATCGTTGATGCCGTGCAAAAACTTCTGACCATCGGCAAG 
GACGAGGGCCTAAGCGTTGATGAACTGATCGAATCCGTC7U\GGTAGATGCGATCTCCACT
CACCTGTACACCTCTGGCCAACCAGACCCAGACCTGGTGATCCGCACCTCTGGTGAGCAG 
CGACTTTCCGGATTCATGCTGTGGCAATCTGCCTACTCCGAAATCTGGTTCACAGACACC 
TACTGGCCAGCCTTCCGACGCATCGACTTCCTCCGCGCCATTCGCGACTACTCGCAGCGC 
AGCAGAAGATTCGGTAAA 

>RXA00583-downstream
TAACTTATTCTCCAAGGAGAGAC 

>RXA00592-upstream

ATACTTGGGGTGAGGACGGCTGGGAGTTGGTCTCGGTCATGCCTGGTATGAACCCTGAGA 
ACCTCGTTGCTTACATGAAGCGTGAGGTGGCTTAGTTCTT 

>RXA00592 

ATGGCTTCTAATTCCGAACGCCTTGCAGAGCTGGGCATTTCTCTTCCTTCCGTTGCAGCG 
CCTGTTGCTGCGTATGTTCCTGCGATTCAGACCGGTAACCAGGTGTGGACTTCTGGTCAG 
CTGCCTTTCGTTGATGGTCAGCTTCCGGCCACCGGCAAGGTTGGCGCTGAGGTTTCCGCT 
GAGGATGCGGAGAAGTTGGCTCGTGCGGCTGCGCTAAACGCTCTTGCTGCGATTGATGCG 
CTTGTTGGCATTGATAAGGTCACTCGCGTTTTGAAGATTGTTGGTTTCGTGGCGTCTGCT 
GATGATTTCAGTGGTCAGCCTGCTGTCGTCAACGGTGCTTCCAATTTGATGGGTGAGGTT 
TTCGGCGAGGCTGGGGCGCATGCGCGTTCTGCTGTGGGCGTGGCGGAGTTGCCGCTCAAC 
TCGCCTGTCGAGGTCGAGGTTATCGTCGAGATCGCGCAG 

>RXA00592-downstream
TAGCACGCTTTTCGACGCAAAAT 

>RXA00593-upstream

TTGGTGTGATGCATTCGACAGCAAATTGGCTGTGTGACTACACTTGCGAGTGTATTAAGT 
ATTAGGCCGTGCATATGTAGCGCATTTTAAGGAGATTGTC 

>RXA00593

ATGACGTCTGTGATTCCAGAGCAGCGCAACAACCCCTTTTATAGGGACAGCGCCACAATT 
GCTTCCTCGGACCACACAGAGCGTGGTGAGTGGGTCACTCAGGCAAAGTGTCGAAATGGC 
GACCCAGATGCATTGTTTGTTCGTGGTGCAGCGCAACGCCGAGCAGCAGCAATTTGCCGC 
CACTGCCCTGTAGCCATGCAGTGCTGCGCCGATGCCTTAGATAACAAGGTGGAATTCGGA 
GTCTGGGGAGGCCTGACCGAGCGCCAGCGCCGTGCATTGCTTCGAAAGAAGCCGCACATT 
ACTAACTGGGCTGAATATTTGGCTCAGGGGGGCGAGATCGCCGGGGTT 

>RXA00593-downstream
TAATTAATTTCAAGGGCTGGCCA 



>RXA00603-upstream

GAATGAATCTCTTGCGTTTTTTGCACACTACAATCATCACACAATTGCCGGGTAGTTTTG 
TTGCCAGTTTGCGCACCTCAACTAGGCTATTGTGCAATAT 

>RXA00603 

ATGAAGCTAGATTCCATTGATCGCGCAATTATTGCGGAGCTTAGCGCGAATGCGCGCATC 
TCAAATCTCGCACTGGCTGACAAGGTGCATCTCACTCCGGGACCTTGCTTGAGGAGGGTG 
CAGCGTTTGGAAGCCGAAGGAATCATTTTGGGCTACAGCGCGGACATTCACCCTGCGGTG 
ATGAATCGTGGATTTGAGGTGACCGTGGATGTCACTCTCAGCAACTTCGACCGCTCCACT 
GTAGACAATTTTGAAAGCTCCGTTGCGCAGCATGATGAAGTACTGGAGTTGCACAGGCTT 
TTTGGTTCGCCAGATTATTTTGTCCGCATCGGCGTTGCTGATTTGGAGGCGTATGAGCAA 
TTTTTATCCAGTCACATTCAAACCGTGCCAGGAATTGCAAAGATCTCATCACGTTTTGCT 
ATGAAAGTGGTGAAACCAGCTCGCCCCCAGGTG 

>RXA00603-downstream
TGAAGCATGCATTTTGAAGCATG 

>RXA00609-upstream 

AAGGCCCACGGTGGCCGAGCTTTCGTCAATTCCACACCAGGTCTAGGATCCATTTTCGGC 
CTGGAAATCCCCGCACCAGAACAATCAAAGGAATACACCC
>RXA00609 

ATGAGCAAGATCCTGCTCGCTGAAGATGACGCCGGCATCGCAGATTTCATCGTTCGTGGC 
CTCATCCGCGAAGGCTTCGAATGCGAGGTCACCGAATCCGGCGCCGAAGCTTTCGCCCGC 
GCACATTCCGGCGATTTCGATCTCATGGTTTTAGACCTCGGCCTCCCCCACATGGACGGC 
ACGGATGTCCTAGAGCAATTAAGAAATCTGCAGGTCACGCTACCTATCATTGTGCTCACG 
GCACGCACCAACATCGAGGACCGCCTCCGCACCCTCGAGGGCGGCGCCGACGATTACATG 
CCCAAACCATTCCAATTCGCAGAATTACTGGCCCGCATCAAACTCCGCCTCGCCAAACAC 
ACTCCTCAGGAAACGCCGACCGATGCGCGCGTGCTACGAAACGGCGATTTGGAGCTCGAT 
CTTCGTACCCAGCGTGTGCTCATCGACGGCTCCTGGCACGACCTTTCCCGCCGCGAAGTC 
GATCTGCTCGAAACCCTCATGCGACACCCAGGGCAAATCCTCTCCCGAGTCCAACTCCTC 
CGACTGGTGTGGGACATGGATTGGGACCCCGGCTCAAACGTGGTGGACGTATATATCCGC 
GCGTTGAGGAAGAAAATCGGTGCCCATCGGGTCGAAACCATCCGAGGATCTGGCTACCGG 
CTGCGC 

>RXA00609-downstream

TAACTGCAGAACGAGACCAAAAA

>RXA00638-upstream 

AATTTCAACTCTTTCAGTACATTCACAGTTAGTATTCAGTGGTGTTGAAGTTCCAGGGTG 
TTCACTAGTGGGAAGTTAATCATTCCGCTAATGGACACCA 

>RXA00638

ATGGAAGAGATAAAAATGGACAACCAGTCTGACGGACAAATCCGCGTACTCGTCGTTGAT 
GACGAGCCAAACATCGTCGAGCTGCTCACCGTAAGCCTTAAATTCCAAGGCTTCGCAGTG 
ATGACCGCCAACGATGGCAATGAAGCCCTGAAGATTGCTCGTGAGTTCCGTCCAGACGCA 
TACATCCTCGATGTCATGATGCCAGGAATGGACGGCTTCGAGCTGCTGACCAAGCTGCGC 
GGCGAAGGCCTTGACAGCCCAGTTCTGTACCTCACCGCAAAGGATGCCGTGGAGCACCGC 
ATCCACGGCCTGACCATCGGCGCTGACGACTACGTGACCAAGCCTTTCTCCCTGGAAGAA 
GTAATCACCCGCCTGCGCGTGATT 

>RXA00645-upstream

CTTATCGACGAGCTCCAAGGCTGGACCGTGGTACGTGCCACTTCCCTGTCGTGGCTGAAA 
TAAAAATCCCCGAAACCTCCTTGGACACATCGCCCACAAA 

>RXA00645

TTGGGTGCGCACTCCGCCAACTCCATCCGTGGTGTGATCGACCGTCTCGATGCCTCCACC 
GTGGTGATCGTTGCCGATGTCCACTGGGCCGACGTGGAATCCATGCAAAAACTCATCGAA 
TATTCCATGCGCATGGTTTCTGGCCGTTTCGCACTCATCATGATTGGCCTTGATGAAGAG 
AACTTAGTGTTCCACGATGAGGTGGTCTCGCTCCCCTCCATCGCAGACTCCACCTACGTA 
TTGCCGCCGATGAGTATTGAAGAAATCCGCCAGCTTGCGCTTACCGATGTCCGCGGCCGC 
ATCAGCACCACCACCGCCACAGACATCCAGCGCATCACCGGCGGCATCTACGGGCGAGTC 
AAAGAAGTCCTCCACTCGGAATCCCCCGATCACTGGCGAATGCCCAACCCAAATATTCCC 
ATCCCACAAAGCTGGCATGCCAACCTGTTGAGACGCATCACCAACGAAGAAGTCTGGCAT 
GTACTACTCGCCGTCGCTGTCCTTCCCTCCGGAGGCCCCATTGACCTGGTAAAACTCATA 
GGCAACGACCCCACGGGCATGCTTTGCGACGACGCCGTCCGCTCAGGCCTGCTCCGCGTG 
CTGCCGTCTGACGGCCAACCACAAGTGGATTTGGTCCTGCCGATCGACCGCGCCGTACTG 
CAATCACGCACTCCGCTCAACATTCTGGCGCAGTTGCACCACAAGGCAGCCGAATATTAC 
GGCAAGTGGAATCAAAAAGATGCCCAACTGGAGCACGAAGCATTTGCTGCAATTGATCCA 
AATGATCCAGCAGTGCGAGCCCTAGCGCAGCGCGGATATGCGTTGGGTAGGACTGGCCAC 
TGGATGGAATCGGCACACGCCCTATCTCTTGCCGCGAACCGCACTGCACACCAAGAAGAA 
TCAAATAAGTACTTGCTGGAGTCCATCGATTCACTGATCGCCGCCGCCGATCTCCCCCAA 
GCTCGATCCAGAGCATCCACCCTTGATCTTGGAGAAACCGGCATTCAACAAGACTCAATG 
CTGGGCTACCTGGCAATCCACGAAGGCCGGCGCCTCGAAGCACGCAATCTCCTTCATCGT 
GCTTCTGAAGAATTGCTGGCGCAGCACCCGATTGATCCGATCCACGGCCCCCGCATGGCT 
CAGCGCAAAGTACTGTTAAACTTAGTGGACTGGAATCCAGAAGAACTCCTGGTGTGGGCT 
GATAGAGCAGTCGCATGGACTGAAGAGGATGCTGGCGAAAAGGTTGAGGCCCAAGCTATT 
TCCCTCATTGGACAATCCATCCTCGATGGCTGCCTCCCCGAAGATAAACCCATCCCCGGT 
GAAACCACCCTTCACGCACAACGCCGCCACATGGCAATGGGCTGGCTTTCCATGGTTCAC 
GATGATCCAGTAACTGCACGTCAAAAGCTTGAACGTCGCACATCCATCAATGGTTCAGAA
CGCATCAGTTTGTGGCAAGACGGATGGCTGGCTCGGTCCCTACTGCTGCTCGGCGAATGG 
GAGTCCGCAGCACGCACCGTAGAAATCGGTCTGGCCCGCGCCGAACAGTTTGGCATCCGC 
TTCCTCGAACCACTGTTACTGTGGTCGGGCGCCACAATTGCAACAGCCCGCGGAAACTCT 
GACTTGGCACGAAATTACATGAGCAGACTGTCCACCGATCAAGACTCCTTCATCGTCCAA 
TCTATGCCATCTGCGATGTGTCGCATGTGGGTCCACCGCCATAGAAATGAAATCCCCGGT 
GCGATCGTGGCCGGAGAACAATTGGAAAAAATCGCCGCACACAAACACGTCAACGCACCT 
GGATTCTGGCCATGGCAAGACGTCCACGCAACGCATCTCATCCGCATCGGCGAAACTGAG 
CGCGCCCAGGAGTTAGTGAACTCCACGCTTGAGGAGCTCAGAGGCTCCGATATCATGTCT 
GCCCACGCAAAAATTGCCGTTCCCGACGCCATGTTGATGATCCACCACGGAGATGTGAAA 
AAGGGATTTAAGCGTTTCGACGACGCCCTCGATATGATCGATCCCCTCACCCTCCCCTAC 
TATCGGGCACGCATCTGCTTTGAATACGGCCAGGCCCTGAGACGCCAGGGGCAACGTCGA 
CGTGCTGATGAACAATTTGCCCGTGCAGCTTCCCTATTCCAAGACATGGGCGCCGACGCG 
ATGGTCACCCTAGCCAACCGAGAACGCCGGGTGGGTGGCCTTGGTCAACGATCCGAGCAA 
GCCGGTGGGCTCACCCCTCAGGAATATGAAATTGCCCGATTAGTGTCATCTGGGCATGCC 
AACCGAGAGGTCGCACAGGAGCTTTTCCTCTCGCCTAAGACCGTGGAATAC 

>RXA00651-upstream

GGCTGCCTCGGTGGTGGTCTCTGGGGTTGCTTCAGGTTCCGCCGGGGTACAAGCGGTGAG 
CATGATGGAAGCAGCGAGGATAGTAGGTAATGTACGACGC 

>RXA00651 

ATGCAGTCAAGCCTAGATCGTGTGTCGGAAACCGGACGCAATGAGCTCGATGTTGAAACC 
CTTGTGAAGAAGGGGAATCAACCGGGCGCGATGAGCTATCGCAACAGTATCCACATTTTG 
ACAGCCTCGCTGCTGGTCGTGGGGTTGGGAGCTTCCGCCCGCCTGACGCTGCCGATGTTT 
GCGCTGTCGTGCGTGCTGTTGTTTGTGTGGGGTTTTCTGTACTTCTATGGATCAACCAAA 
CGCGTAGATTTGAGCCACGGCATGCAGCTGGGCTGGCTGTTTGTGCTGACGCTGGTGTGG 
ATTTTTATGGTGCCGATCGTGCCCGTGTCCATTTATCTGCTGTTCCCGCTGTTTTTCCTC 
TATCTACAGGTGATGCCTGACGTGAGAGGCATTATTGCGATTTTGGGTGCGACAGCGATT 
GCGATTGCCAGCCAGTATTCCGTGGGGTTGACCTTTGGTGGTGTGATGGGTCCGGTGGTC 
TCTGCGATCGTGACCGTGGCTATTGATTACGCGTTCCGCACGTTGTGGCGGGTGAATAAT 
GAAAAGCAGGAATTGATTGATCAGTTGATTGAAACTCGCTCCCAGCTGGCGGTGACGGAA 
CGAAATGCGGGTATTGCTGCGGAACGTCAACGTATTGCGCATGAAATTCATGACACGGTC 
GCCCAGGGACTCTCCTCCATTCAAATGCTGCTGCATGTCTCTGAACAGGAGATTCTCGTT 
GCTGAGATGGAAGAGAAGCCAAAGGAGGCGATCGTGAAGAAGATGCGCCTTGCCCGACAA 
ACAGCCTCCGACAATCTCAGTGAGGCTCGCGCGATGATTGCGGCGTTGCAACCGGCAGCG 
CTGTCTAAAACCTCCTTGGAAGCAGCACTTCACCGCGTCACAGAACCGTTGTTGGGTATT 
AATTTTGTGATTTCTGTCGACGGTGATGTTCGCCAACTGCCCATGAAAACTGAAGCCACC 
CTTCTGCGAATTGCTCAAGGTGCGATCGGAAATGTGGCGAAACATTCAGAGGCGAAAAAC 
TGCCACGTGACACTAACCTACGAAGACACAGAAGTACGCCTTGATGTGGTTGATGACGGT 
GTGGGTTTTGAGCCTTCGGAAGTGTCCAGTACCCCCGCTGGCCTTGGCCATATCGGCTTA 
ACCGCATTGCAGCAGCGTGCGATGGAATTGCACGGCGAAGTTATAGTGGAATCTGCATAT 
GGGCAGGGTACTGCGGTATCTGCAGCATTGCCGGTGGAGCCACCAGAGGGGTTTGTCGGG 
GCGCCGGTTTTGGCAGATTCGGACTCAAGTGCTACAGGCGAGGTTGAACTAAGTTCTCCA 
ACTGACGATGAG 

>RXA00651-downstream
TAAGGCTAGACTAAAGTACGATT 

>RXA00655-upstream

TTGCCTGTGGATTAAAACTATACGAACCGGTTTGTCTATATTGGTGTTAGACAGTTCGTC 
GTATCTTGAAACAGACCAACCCGAAAGGACGTGGCCGAAC 

>RXA00655 

GTGGCTGCTAGCGCTTCAGGCAAGAGTAAAACAAGTGCCGGGGCAAACCGTCGTCGCAAT 
CGACCAAGCCCCCGACAGCGTCTCCTCGATAGCGCAACCAACCTTTTCACCACAGAAGGT 
ATTCGCGTCATCGGTATTGATCGTATCCTCCGTGAAGCTGACGTGGCGAAGGCGAGCCTC 
TATTCCCTTTTCGGATCGAAGGACGCCTTGGTTATTGCATACCTGGAGAACCTCGATCAG 
CTGTGGCGTGAAGCGTGGCGTGAGCGCACCGTCGGTATGAAGGATCCGGAAGATAAAATC 
ATCGCGTTCTTTGATCAGTGCATTGAGGAAGAACCAGAAAAAGATTTCCGCGGCTCGCAC 
TTTCAGAATGCGGCTAGTGAGTACCCTCGCCCCGAAACTGATAGCGAAAAGGGCATTGTT 
GCAGCAGTGTTAGAGCACCGCGAGTGGTGTCATAAGACTCTGACTGATTTGCTCACTGAG 
AAGAACGGCTACCCAGGCACCACCCAGGCGAATCAGCTGTTGGTGTTCCTTGATGGTGGA
CTTGCTGGATCTCGATTGGTCCACAACATCAGTCCTCTTGAGACGGCTCGCGATTTGGCT 
CGGCAGTTGTTGTCGGCTCCACCTGCGGACTACTCAATT 

>RXA00655-downstream
TAGTTTCTTCATTTTCCGAAGGG 

>RXA00813-upstream 

TCCAGCGGCTGGCTAAGTCCGTGGAGATGCATGGGCTGACCGGTTCTTTGCCGAGGGTTT 
TAAGCTCAGCATGCGACGCGGTCCTCGGGGAGGTGGCGGC 

>RXA00813 

ATGACTGACATTGATCTGGTGGTGGAAAACGTCCAAAGGATTATCGCCACCAAAGAGACA 
CCGCCGACCTCTGCGGAAATAGCGAGCCTGATTCGGGAACAAGCAGGCGTGATCAGTAAC 
GAGGACATCGTGATGGTGTTGCGTCGACTGCGCAGTGATTCTGTGGGCGTGGGACCGTTG 
GAATCTCTGCTTGCGCTTCCTGGCGTGACGGATGTGTTGGTTAATGCCCATGACAGCGTG 
TGGATTGATCGCGGTCAGGGCGTGGAGAAAGTCGACATGGATCTGGGCTCAGAGGAGGCG 
GTGCGTCGCCTTGCCACCCGGTTGGCGTTGACCTGTGGCAGACGCTTAGATGATGCGCAG 
CCTTTCGCTGATGGCCGAATCACCAGGGACGACGGCAGCGTGTTGCGCATTCACGCGGTG 
TTGGCACCCTTGGCGGAATCCGGCACGTGCATCAGTGTGCGAGTACTGCGTCAAGCACGG 
CTGAGCCTTGATGATCTTATCCAAAGCGGCACGGTGCCTGAGGACATCGCGCCTGCGCTC 
CGGAACATCATCAATCAACGGCGCTCGTTCCTTGTTGTCGGTGGCACCGGCACAGGGAAA 
ACCACATTGCTGTCCGCGATGCTCACCGAAGTTCCCGCTGATCAACGAATCATCTGCATC 
GAGGACACCGCAGAGCTTCATCCCGGCCATCCAAGCACCATCAACTTGGTGTCTCGCCAA 
GCAAACGTCGAGGGCGCCGGCGCCGTGAGCATGGCGGATTTGTTGAAACAATCGCTGCGC 
ATGAGGCCTGACCGGATTGTCGTCGGAGAGATTCGCGGTGCGGAAGTCGTGGATCTTTTG 
GCTGCGATGAATACCGGACACGACGGCGGTGCTGGCACCATTCACGCGAACTCCATCTCT 
GAAGTTCCCGCGCGCATGGAAGCTCTTGCGGCGACCGGCGGATTGGACCGCATGGCATTG 
CATTCTCAACTCGCGGCCGCAGTGGACATTGTGCTGGTCATGAAACACACCCCTTTTGGC 
CGCAGGCTAGCTCAACTCGGGGTGCTCCGCGGAAATCCTGTGACCACGCAGGTGGTGTGG 
GATTTGGACCACGGCATGCACGAAGGGAGCGAAGAGGCATGGTTTATGCCC 

>RXA00813-downstream 
TAGGCCTTCTTAGCGTGGCGGTG 

>RXA00822-upstream

CAGTTAGGTGTCATCCGGATTTTATCTCAAACCCTAACACCCCAGGTGTTGCCACTCATC 
CGGACTCAAACAAGATGTGTGCAGATGAAGGAGAAAAGCA 

>RXA00822 

GTGGAAGGTGTACAGGAGATCCTGTCGCGCGCCGGAATTTTTCAAGGCGTTGACCCAACG 
GCAGTCAATAACCTCATCCAGGATATGGAGACCGTTCGCTTCCCACGCGGAGCAACCATC 
TTCGACGAGGGCGAGCCAGGTGACCGCCTTTACATCATCACCTCCGGCAAAGTGAAGCTT 
GCGCGCCACGCACCGGACGGCCGCGAAAACCTGCTGACCATCATGGGTCCTTCCGACATG 
TTCGGTGAGCTCTCCATCTTCGACCCAGGCCCACGCACCTCCTCTGCAGTGTGTGTCACC 
GAAGTTCATGCAGCAACCATGAACTCTGACATGCTGCGCAACTGGGTAGCTGACCACCCA 
GCTATCGCTGAGCAGCTCCTGCGCGTTCTGGCTCGTCGTCTGCGTCGCACCAACGCTTCC 
CTGGCTGACCTCATCTTCACCGACGTCCCAGGCCGCGTTGCTAAGACCCTTCTGCAGCTG 
GCTAACCGCTTCGGCACCCAAGAAGCTGGCGCGCTGCGCGTGAACCACGACCTCACTCAG 
GAAGAAATCGCACAGCTCGTCGGTGCTTCCCGTGAAACTGTGAATAAGGCTCTTGCAACG 
TTCGCACACCGTGGCTGGATCCGCCTCGAGGGCAAGTCCGTCCTCATTGTGGACACCGAG 
CATTTGGCACGTCGCGCTCGA 

>RXA00822-downstream
TAATCACCAT^GCGCTAAAAAGC 

>RXA00839-upstream

CACAATGTGCGGAACGGGCGATAATGCGAGTACAGTGATACCGTTCTAAAACAATGGACT 
CGTTTTACAAGTCCTCCATACTTCTTTATCCGGCAGGAGA 

>RXA00839 
ATGCCCCCAGTGACCCACCCAGAGTTTCGTAACGTAGCGATTGTCGCGCACGTTGACCAC
GGAAAGACCACACTCGTTAATGCCATGCTTGAACAGTCTGGCGTATTCAGTGACCACGGT 
GAAGTAGCCGACCGTGTGATGGACTCCGGTGACCTGGAAAAGGAAAAGGGCATCACCATC 
CTTGCCAAGAACACCGCGATTCGTCGTA/^GGCGCTGGCAAGGACGGCAATGACCTGATT 
ATCAACGTCATTGACACCCCAGGCCACGCTGACTTCGGTGGCGAAGTTGAGCGCGCACTG 
TCCATGGTTGACGGCGTTGTCCTTCTGGTTGACGCATCTGAAGGCCCACTGCCTCAGACC 
CGATTCGTG 

>RXA00845 

TCCTCCTTCCTTGGTCGTATCGGCCTGGTTCGCGTTCACGCAGGTACCTTGCGTAAGGGC 
CAGCAGGTTGCATGGATTCACTACGATGAAGAAGGTAACCAGCACACCAAGACCGCTAAG 
ATCGCAGAGCTTCTGGCTACCGTTGGCGTTGCCCGCGTTCCTGCTACCGAAGTTGTTGCA 
GGTGACATCGCTGCTATCTCCGGCATCGAAGACATCATGATTGGCGATACCCTCGCGGAT 
CCTGAGAACCCAGTTGCACTGCCTCGCATCACCGTTGATGAGCCAGCACTGTCCATGACC 
ATCGGTGTGAACACCTCACCAATGGCTGGTCGTGGCGGCGGAGACAAGCTGACCGCACGT 
GTGGTCAAGGCTCGTCTTGAGAACGAACTGATCGGTAACGTGTCCCTGAAGGTCAACCCA 
ACTGAGCGCCCAGATACCTGGGAAGTTCAGGGTCGTGGCGAAATGGCTCTGTCCATCCTC 
GTTGAGACCATGCGTCGCGAAGGCTTCGAGCTCACCGTTGGTAAGCCACAGGTTGTTACC 
CAGACCATCGACGGCAAGCTGCACGAGCCTTACGAGATCATCGTCATCGACGTTCCTTCC 
GAGTACCAGGGCAACGTGACCCAGCTGCTGGCTACCCGCAAGGGCCTCATGCAGTCCATG 
TCCACCACCCCAGGTTCCGACTGGATCCGCATGGAATTCCGTATTCCTGCTCGTGGCCTG 
ATTGGTTTCCGTACCCAGTTCATGACTGAAACCCGTGGTACCGGTATCGCTAACTCCTAC 
TCTGACGGCATGGATGTTTGGGCTGGCGAAATCAAGGGCCGCGCACACGGTTCCTTGGTT 
GCTGACCGTTCCGGCCAGATCACCGCTTACGCTCTGACCCAGCTGGCAGACCGTGGTAGC 
TTC 

>RXA00849-upstream

GCAAAGCTTTCGCCTGCTGATTGACCATATTGAGTCGCAGTGACTCAAGTTTCCAGGTAA 
ACTGGGAACAAATTTTAGGGAAAGGGAGTTGAACCTAACG 

>RXA00849

ATGGTTACTTATACAACCCTTCTAGACAAGCCGATTTCAGAATCTGCCCCACGGAAAGCT 
CCAGAGCCACTTCTCCGCGAAGCTCTGGGTGCAGCTCTTCGTTCTTTCCGTGCTGACAAG 
GGCGTTACTTTGCGTGAGCTGGCGGAAGCTTCACGTGTGTCACCTGGTTATCTTTCAGAA 
TTGGAACGCGGCCGCAAAGAGGTGTCCTCTGAGCTTCTTGCCTCCGTGTGCCACGCTTTG 
GGGGCCAGCGTTGCGGATGTGTTGATCGAAGCTGCAGGTTCCATGGCGCTGCAAGCAGCG 
CAGGAAGACCTCGCTCGCGTC 

>RXA00849-downstream
TAAGCGCATGGGTGGGCGTCGAA 

>RXA00885-upstream

CAACCTGGCGGTGACCGATGCGGGACGTTTGCTTGCCGACGGCATCATCGCCGACATTTT 
GCTTAGTGAAGAAGACTAAATATTTAGTAGGGTTACAGAC 

>RXA00885 

ATGGTGAGTGCAACAGAGAAACGTAGATACGAAGTGTTGCGGGCCATCGTCGCTGATTAC 
ATTGCGTCTCAGGAACCTGTCGGATCGAAGTCACTCCTCGAGCGCCATAAGCTCAACGTG 
AGTTCTGCGACGATCCGCAACGATATGTCGGTGCTGGAATCCGATGGCTTTATCGTCCAG 
GAGCATGCAAGTTCTGGCCGGGTACCAACCGAAAGGGGTTACCGCCTTTTTGTTGATTCC 
ATCCATGACATCAAACCGCTGTCGCTGGCGGAACGGCGCGCTATTTTGGGCTTCCTTGAA 
GGGGGAGTGGACTTAGAGGACGTGCTGCGCAGATCTGTGCAGCTGTTGTCTCAGCTCACC 
CATCAGGCTGCCGTGGTGCAGCTGCCCACCCTGT^AAACAGCGCGCGTGAAGCACTGCGAG 
GTGGTGCCGCTGTCGCCGATGCGCTTGCTGCTGGTGCTCATTACCGATACTGGCCGTGTA 
GATCAGCGCAACGTGGAACTTGAGGAACCGCTGGCGGCGGAAGAAGTTAATGTGCTGCGC 
GATCTGCTCAACGGCGCGCTAGGGGAGAAAACGCTGACGGCTGCATCAGATGCGCTGGAA 
GAGTTGGCTCAGCAAGCCCCAACCGATATTCGTGATGCCATGCGCCGCTGCTGCGATGTG 
CTGGTGAACACGCTTGTCGATCAACCCTCTGACCGCCTGATCCTCGCCGGCACCTCAAAC 
CTCACCCGCTTAAGCCGGGAAACCTCCGCGAGCCTGCCCATGGTTTTAGAAGCCTTGGAA 
GAGCAGGTGGTCATGTTGAAACTGCTGTCCAATGTCACTGATCTTGACCAAGTGCGCGTG 
CATATTGGCGGCGAAAATGAAGACATTGAGCTGCGCAGCGCAACGGTGATTACCACCGGT 
TACGGCTCCCAGGGCAGCGCACTGGGCGGATTGGGGGTGGTTGGCCCCACCTATATGGAC
TACTCGGGAACAATTTCTAAGGTGTCCGCCGTTGCTAAGTATGTTGGTCGTGTGCTCGCT 
GGCGAA 

>RXA00885-downstream
TAGCTGCGGGTATAGTTGGCCAT 

>RXA00894-upstream

GATGCCCTCGTTTTGGTGCTGGTCCAAGACGTAGCGCTGTAGTTTTCTCGGATTTGTTGG 
AACTAGTCGTAATTAGTCGGATTTTAAGGAGGCTCACGCC 

>RXA00894

ATGGGCATTGAGTTTAAGCGTTCACCGCGACCCACCCTGGGCGTTGAGTGGGAAATTGCA 
CTTGTTGATCCAGAAACACGTGATCTAGCCCCGCGCGCTGCAGAAATACTAGAGATTGTG 
GCCAAGAACCACCCTGAGGTGCACCTCGAGCGCGAATTCCTCCAAAACACCGTGGAGCTT 
GTCACCGGAGTGTGCGACACCGTCCCCGAAGCGGTGGCAGAGCTTTCCCACGATCTAGAT 
GCGCTGAAAGAAGCAGCGGATTCTCTCGGGCTTCGGTTGTGGACCTCTGGATCCCACCCA 
TTTTCGGATTTCCGCGAAAACCCAGTATCTGAAAAAGGCTCCTACGACGAGATCATCGCG 
CGCACCCAATACTGGGGAAACCAGATGTTGATTTGGGGCATTCACGTCCACGTGGGCATC 
AGCCATGAAGATCGCGTGTGGCCGATCATCAATGCGCTGCTGACAAATTACCCACATCTG 
TTGGCACTTTCTGCAAGCTCTCCAGCATGGGACGGACTTGATACCGGTTATGCCTCCAAC 
CGGACGATGCTCTACCAACAGCTGCCTACAGCCGGACTGCCATACCAATTCCAAAGCTGG 
GATGAATGGTGCAGCTACATGGCGGATCAAGATAAATCCGGTGTCATCAACCACACCGGA 
TCCATGCACTTTGATATCCGCCCCGCATCCAAATGGGGAACCATCGAAGTCCGCGTGGCC 
GATTCTACCTCCAACCTGCGGGAACTGTCTGCCATCGTGGCGTTGACCCACTGTCTCGTG 
GTGCACTACGACCGCATGATCGACGCTGGCGAAGAGCTTCCCTCCCTGCAACAATGGCAC 
GTTTCGGAAAATAAATGGCGCGCGGCTAGGTATGGTCTGGATGCCGAAATCATCATTTCC 
AGAGACACCGATGAAGCGATGGTTCAAGACGAACTCCGCCGACTAGTAGCGCAATTGATG 
CCTCTAGCCAACGAACTCGGCTGCGCTCGTGAGCTTGAACTTGTGTTGGAAATCCTGGAA 
CGTGGTGGTGGATACGAACGCCAACGCAGAGTGTTTAAAGAAACTGGCAGTTGGAAAGCT 
GCAGTTGATTTAGCCTGCGACGAACTCAACGACCTCAAAGCACTGGAC 

>RXA00894-downstream
TAAATAGCTATGGTGGAATCCCA 

>RXA00947-upstream

CAACGAGTTTGATCGTACAACTAGCTTGGCTTGAGGGCAAGGTACGATTAGAGTCGTACC
TAAATTCAATACTGTTTTACGAGATCATTTGGAGGGTGTC

>RXA00947

ATGGCCCGCAAATTGGAACATCCATCTTTGGCCGAGATGAATTTAAATGCCATCATGTTT 
GCGCTGTCGGATCCTATTAGGCGACAAATCCTGTCGCAGCTGTCGTGCGGACATAATGAT 
CAGGCATGTGTTGCCTTCGAGCTTCCAGTATCTAAATCCACCTCAACGCACCACTTCCGC 
GTACTCCGTGAGGCGGGTCTGATTACTCAGCGCTATGAAGGAACTGCCATTCTAAGTGCG 
CTGCGCAGCGAAGATATGGAAGCGCGTTTTCCGGGACTGCTGACTTCTGTCATGCGAGCG 
GAAGTGGAAGAGCGCAACGCAGCTGACTTGCCCGTT 

>RXA00947-downstream
TAGGACGGTTAGCAAGTATTATC 

>RXA01001-upstream

TGGGACACCGTGGGCATGCTGCTCATCGTCGTGGTTGTCGCAACGATGATCGTCGATCTC 
ATCTCCGGCACCATCCGCCGCCGCATCATGAAGGGGGCTA 

>RXA01001 

GTGACCGTGTCGTGGCACCAAGCAACTGACGCTCCACCAAGCATCCGCATCACCACGCTT 
GCGCCATCGCTGCAGCCTAATCAGCGCAAAGTCGCCGAAGTCATGCTTGTCGACGCCCCC 
AGCATCGTCGAACTGACCGCTCAGGGCCTTGCAGATCGCGTGGGGGTTGGGCGTGCCACC 
GTCATCCGCACCGCCCAGTCCTTAGGCTACGACGGATTCCCGCAGCTGCGCGTCGCCCTG 
GCGCAGGAACTGGCACTGGCGCAGGGCGCGTCGAGAAGCATGGTTGAAGGAGCGTTAAGC 
TCCTCGTTGCTTGGTCAT 
>RXA01065

ATCCTTGAAGCAGTCCGCAAGGTATCGCCTAAAACTCCTATCCTCGGCATCATCACCAAA 
GCAGACAGCGTCTCACGTGACTTGGTTGCGGCCCAACTGATGGCTGTCCATGAGCTGCTC 
GGCGGAAACAGCGAGGTAGTCCCAGTGTCTTCCACCTCGGGGGAAAACGTCGAAACGCTT 
ATTAAGGTCATGACCGACCTGCTGCCTGAAGGCCCCAAGTTCTACCCGGATGATCACATC 
ACCGATGAGGACACCAACACCCGCATCGCGGAAGCCATCCGCGAAGCAGCACTGTCTGGC 
TTGAAGAACGAACTGCCGCACTCCGTCGCAGTTGAGGTTGATGAAATCCTGCCAGACCCA 
GAACGCAACGGTGTCCTGGCTGTGCACGCCATCATCTACGTCGAGCGTGTTGGTCAGAAA 
GACATCATCGTCGGACACAAGGGACAGCGCCTGGGGCGCATCATCCACACCTCACGCCAA 
GACATCATCAAGATCCTCGGCCAAAACGTATTCCTTGACCTGCGCATCAAGGTGCTGAAG 
AACTGGCAATCCGATCCAAAGGCTTTGAACCGCCTGGGCTTC 

>RXA01065-downstream
TAGCTTTAAGGGGGTGAGTTCAT 

>RXA01110-upstream 

ATCATCGACATGCTCCGGAAAACTTAAAAATTCCCGGACGGTTCACGCAGATTACCCTAG 
CAAAGCAATCTAGCTGACGACCCAATATAGTCCTGTCATT 

>RXA01110 

ATGCTGGCAATTGTGCAGCTATCAAAAGAATCTATTATTGGCGCAGCCGTTTCGATCTTG 
AGCGAATTCGGTTTGTCGGATATGACCATGCGCCGCGTCGCAAAGCAGTTAAATGTCGCG 
CCGGGCGCGCTGTATTGGCATTTTAAAAATAAGCAGGAGCTTATCGACGCCACCTCGCGC 
TATCTCCTCGCGCCTGTCTTGGGGCGCAACGACGAGCAGCGAGCAAGCATTTCCGCGCAG 
GAAACCTGCGCGGAAATGCGTTCGCTGATGATGCAAACCAAAGATGGTGCGGAAGTCATC 
AGTGCCGCACTGAGTAATCAGCAATTGCGCCAAGAATTGGAATCACTCATTTCTGACTCT 
TTAAAGGAACCTAATGAGGTTGGCGCTTTTACGCTGCTACATTTTGTGGTGGGTGCAGTA 
TTAACAGAACAAACTCAGCTGCAGATGCACGAGTTCACGGCTGGCGCGGGAGATGACACG 
CAAGAAAACCCTGCCGATGCAAACTTTGAGGAGAGATTCAATCAAGGAATAGAAATCATT 
CTGGTGGGTCTAGACGCGCTTGGGCATATAAGA 

>RXA01110-downstream 
TGACGTTCCATGACATCAACGAT 

>RXA01118-upstream 

AGTGGAAGAAATGCTCTTGTTAATCATGTGAGACATGCTAACGTAATGTTCATCATATGC 
ACAAGGGTTCGCAATGCGAACAAAAAGAGGAGTTGATGGG 

>RXA01118 

ATGGTCGAACAATCGCCAGATTTCGTACAATCATTTGCCCGCGGCTTATCTGTGATCCGA 
AGTTTCAGCGCAGATAATCCATCGCAAACACTGTCCGAAGTCGCCAGCCAAACTGGACTC 
TCAAGGGCCACCGCTAGGCGCTTTCTCCACACCTTGACCGACCTTGGATATGCGGTAAAC 
AACGATTCCCGGTTCCAGCTCACACCACGTGTTTTGGAGCTTGGAGCAAGCTACCTTTCC 
GCATTGTCCCTGCCTGCGATCGCGCAGCCCCGCCTGGAGGTACTCTCCCGCCAGGTCGGC 
GAATCAAGCTCCATGTCCGTACTCGACGGCACTGACATCATCTACGTTTGCCGCGTTCCG 
GTGCGCCGCATCATGACGGTGAACATCACCATCGGCACCCGTTTCCCTGCGTACGCCACC 
TCCATGGGACGCATCATGCTGGCCAACCTTCCCGAAGAAGAATTAGATGAAATGCTGGCG 
GCGGCACCCCCTGAACAGTTGACCACCCGGTCACTGACCTCCATCGCCTCAATCCGGGAA 
GAGATCATTGCTACCCGCGAAAGGGGGTGGTCATTGGTGGATCAGGAGCTCGAGCCGGGC 
CTGCGTTCGCTCGCGGCGCCGATCACCAATGCCCAGGGCGAAGTGGTTGCTTCCATCAAT 
GTGTCGACCCAATCGGCATCACATTCGGTGGAAGATATCCGCAAGCTGGTGCTGCCGCAG 
CTTTTAGAAACGGCTCAAGCAATTTCGACAGATCTCTCTGCACTC 

>RXA01118-downstream
TAAATTAAGGATCAAAAAATGAA

>RXA01125-upstream 

ATCGGATTCTCCACGGAACTGGAATCGGTGATCTAA 
>RXA01125 
ATGGCCATCATCGTCGACATCGATGTCATGCTCGCCCGCCGCAAAATGGGCGTCGGCGAA
CTCGCCGAAAAAATCGGCATCACCCCAGCCAACCTCTCGGTCTTAAAAAACGGGCGCGCA 
AAAGCGATACGCTTCAGCACATTGGAGGCCATTTGCCGCGAGCTCGGCTGCCAGCCGGGC 
GACATTTTGCGTTACGACGCCTCCCTCCACAAC 

>RXA01125-downstream 
TAAACCCGAAACGCAAAGAGCCC 

>RXA01211-upstream 

GACAGCTGCAGCGGTGTGGGCGGCGAACCGCTACATGCGCTGGGACTCGTACCGCTAAGC 
CTGCAGCCGACGGGATTAAGGCAGCTAACATTGAGACACG

>RXA01211 

ATGAATAAAGATTTCTGGACCGCAGGCTGGACCGCCCGCTGGTTTTCGCGCGGGGTTTCC 
CTTTTGGCCAGCCCAGTTACCGCCCCACTGAACTCTTGGCGGAGATTGCCTAACTTGGCC 
AAGTACACCCTCTACACCAGGGTGTCGTTGCAAGCGATCCCCGTGGTGTTGCTGTCGGCG 
TATTTCCTGGGCATCGTAGCTAATGCAGGCACCCTGAATCCCTCATTTGTGTGGCTGCTG 
GGTTTCTCGGTCATCCTTTTAATAGTGACGGTATTGGTTTATGAATATCAGCCATCGCTG 
AATTCTCATCCTAGGCGCAGCGTACAGCCGTTCTTCTTCACCGGGTTGGTGCTCAACGTT 
TTAGGCGTTGTGGTGTCTGTGGTGCTTCAAATTCCGGGCTTAAACATGTCGGACAACACC 
CGAGCAACTGCCCTTATTTTCACTCTTACCTGCGTATTTCTGCTTTCGATCGCCTACATT 
CCGTGGATGAATTACCGATGGGTTTGGCTGATCGCAATGTCTGCAGTGTTGTGGTGGACC 
AGCACAACGACTGATTATTTAAGTGCATTGTGGGTGGTTATCCCGCCACTCATGGCAGGA 
ACCGTCCGACTTTCCGTATGGACCGTCGATGTCATGAAAGAGGTTGAGCGTTCCCGCGAA 
TTGGAAGCCTCCCTCCGCGTCACCGAAGAACGCCTTCGTTTCGCCeAGGAACTCCACGAC 
ACTTTAGGACAACACCTGGCGGCAATGTCCGTGAAATCAGAACTGGCGCTTGCCCTGGCG 
AAACGCGGCGACGAC 

>RXA01241-upstream 

GATGCTGCAGGACTTCAATCCACTTCTTCCGTTGCTTACCAGCTTAAAGAGCTAGAGAAG 
AAGGGCTTCCTGCGCATGGGACCCTAATAAGCCTCGCGCG 

>RXA01241

GTGGATGTTCGCCACCTTCCAGAAACTGAAAGCCGTTCCTCCAAGGCTGCTACACAGGCA 
AAGAGCAAGGCCCCTCAGGCCGGGGTCCATGATCCTGAGTTAGCTGGCCAGACCTCATTT 
GTCCCAGTGGTGGGCAAAATTGCCGCTGGTAGCCCGATCACCGCTGAGCAGAACATCGAA 
GAGTACTACCCACTCCCCGCAGAAATCGTCGGAGACGGTGACTTGTTCATGCTCCAGGTT 
GTTGGCGAGTCCATGAGGGATGCTGGCATCCTCACCGGCGACTGGGTTGTTGTTCGTTCC 
CAGCCGGTAGCTGAGCAGGGCGAGTTCGTCGCGGCAATGATTGACGGTGAAGCCACCGTG 
AAGGAATTCCACAAGGATTCATCTGGCATCTGGCTCCTGCCACACAACGATACGTTTGCC 
CCAATTCCTGCTGAGAATGCAGAAATCATGGGCAAGGTTGTTTCCGTGATGCGCAAGCTT 

>RXA01241-downstream
TAAGTCGCTTTTCAGGTTCCCGC 

>RXA01248-upstream

ACGGCGGTCGAAGGGTGATGTTTGAAGATTTGATGAAGATTTCCCCGCCGGGCACTGAAC 
GAGGGTGGTATTCCCCCCGGCCGAAGCGATACTGGGGTCT 

>RXA01248

ATGGCTGACCGCACACCGACCACCGCCACGCCCCCGGGGCGGGTGCTGGTCGTCGATGAT 
GAACAACCCCTGGCTCAGATGGTGGCCTCCTACCTCATCCGGGCCGGCTTCGACACCCGC 
CAGGCGCACACCGGCACCCAGGCCGTGGACGAGGCCCGTCGCTTTTCCCCCGATGTTGTG 
GTGCTGGATCTGGGGCTGCCCGAACTCGACGGCCTGGAGGTGTGCCGACGGATCCGCACC 
TTCTCGGACTGCTACATCCTCATGCTCACCGCGCGTGGCAGCGAGGACGACAAGATCAGC 
GGTTTGACCCTGGGGGCGGATGACTACATCACCAAACCTTTTAGCATCCGGGAACTGGTG 
ACCCGGGTGCATGCGGTGCTGCGCCGCCCGCGCACCAGCACCACCCCACCGCAGGTGACC 
ACCCCCTTG 

>RXA01272-upstream

TATGGCTACGGAAATTACGGCTACGGCGACACCTCCAAAATCAATGCCCCTAAGCCCGAC 
AACACCGAACTAACCACCACCGATGCTTCCAAGGCCAACA
>RXA01272 

ATGAGCAATAGCTTCACTATTCTCACTGTCTGTACTGGAAACATTTGCCGCTCCCCGTTA
GCTAAGCAGCTACTTGAACTTGAGCTTCCGGGGGCAGATATAATCCGCGTTGATTCCGCC 
GGTGTTCAGGCGATGGTTGATTCGCCTATGCCGGAGCAATCTTTAGAAATCGCACGTAAA 
CAGGGCATAGAAAACCCTGAGGAGCACCGAGCTAAGCAGATTACTGAGGAGCTTGTAAAC 
CAATCTGATCTGATTCTTGCGATGGATCGGGGGCATCGAAAATCCATTGTCCAGCTAAGC 
CCGCGTGCAACCCGTAAGGTTTTCACTGTTGTTGATCTTGCCAGGTTAATTGAGGCAACA 
ACTGATGCTGATCTGCAGGAAGAGCTCAATCTGGCAGGGGATTCCGTGATCGATAGGCTG 
CATGCGACAGTTGAGGCTGCTCGTCTTAGCCGCAGTGAATTG7VATCCTCTGGATAACCTC 
GCAGATGAAGATATTGTTGACCCGTACGGAAAGAGTCAATCGGTTTATGAGGCATCGGCG 
AGTCAGCTAATTCCAGCTATTCGTTTGATTGCTTCTTATTTGAACAAAGCACTGGAGTCT 
GCG 

>RXA01272-downstream
TAATGGCGAGGAAGTATCGGGTG 

>RXA01368

AAGCGGATCTGCCAGGGCTGCCCGGTTCGGGATGAATGCCTAGAGTTTGCTCTTGAGCAT 
GATGAACGCTTTGGAATTTGGGGTGGTCTCTCTGAACGTGAGCGCCGCCGCCTGAAACGC 
GAAATTTCG 

>RXA01368-downstream
TAAAACTTCAAGACCAGTAAGCG 

>RXA01375-upstream

CAAGCAGCCAATGGCCGCGTATGAAAACCCCCGCTCCATAAATACAGAACACATACAGAA 
CTTGACCGACAATCTAATTACCGCGAAGGGTTAGCAGCAC 

>RXA01375

GTGACTGAAAAGTATCGTCCCGTCCGTGACATTAAGCCTGCTCCGGCAGCAATGCAATCA 
ACTAAACAAGCGGGCCATCCTGTGTTCCGAAGCGTTGTCGCTTTTGTTTCAGTGCTGGTG 
TTGGTGGTATCGGGTTTGGGGTATCTTGCTGTCGGAAAAGTGGATGGTGTCGCTTCTGGC 
AACTTGAACCTTGGTGGCGGTCGCGGCATCCAGGACGGCAATGCTGCTGACGGTGCTACC 
GATATTTTGTTGGTGGGTTCTGATTCCCGTTCCGATGCTCAGGGCAACACGCTGACTGAG 
GAGGAGCTGGCGATGCTCCGCGCAGGCGACGAGGAGAACGACAACACCGATACGATCATG 
GTGATTCGTGTTCCTAACGATGGTTCCTCTGCCACCGCTGTCGCGATTCCTCGCGATACC 
TATATTCATGATGACGATTACGGCAACATGAAGATCAACGGCGTTTACGGTGCGTACAAG 
GATGCCCGTCGCGCTGAGCTCATGGAACAGGGTTTCACCAATGAGTCAGAGCTGGAAACC 
CGGGCGAAGGATGCTGGCCGAGAAGGTTTGATCGATGCTGTGTCAGATCTCACCGGCATC 
ACCGTCGATCACTACGCCGAAGTTGGCCTTTTGGGATTCGTCCTGCTCACCGATGCTGTC 
GGTGGTGTCGAAGTCTGCCTCAACAACGCCGTCGATGAGCCTTTATCCGGCGCCAACTTC 
CCTGCAGGCCGTCAAACCCTCGGTGGCTCCGATGCGTTGTCTTATGTGCGCCAGCGCCAC 
GATCTCCCCCGCGGCGACCTCGACCGCATCGTCCGCCAGCAGTCGTATATGGCATCGCTT 
GTTAATCAGGTGCTGTCTTCTGGAACACTCACCAACCCTGCAAAGCTTTCCGCACTTGCT 
GATGCCGTCACCCGCTCCGTCGTCATCGACGAAGGCTGGGAGATCATGAGCTTTGCCACT 
CAGCTGCAGAACCTCGCGGGCGGCAACGTCACATTTGCCACCATCCCGGTTACCTCTATC 
GACGGCACCGGCGATTACGGCGAGTCCGTTGTCACCATCGATGTCAACCAGGTGCATGCA 
TTCTTCCAAGAAGCACTCGGCGAAGCAGAGCCAGCTCCAGAAGACGGCTCCGACGATCAA 
TCTGCTGATCAGGCCCCTGACCTAAGCGAAGTCGAGGTCCACGTCCTCAACGCTTCCTAC 
GTCGAAGGCCTCGCCAACGGTATCGCCGCGCAACTGCAGGAATTGGGTTACTCCATCGCA 
GAGACCGGCAACGCAGCGGAAGGCCTCTACTACGAGTCCCAGATCCTCGCCGCCGAAGAA 
GACAGCGCCAAGGCCCTCGCGATTTCCGAAGCCCTCGGTGGTCTCCCATCGTGGCCAACT 
CTTCCCTCGACGACAACACCGTCATCGTCGTATCCGCCGGCGATTACGCTGGCCCTACCG 
CGGAAGCAAACGCCG 

>RXA01375-downstream
TGACATCCAGCACCGTCGGCCAG 

>RXA01418-upstream

AACACTCAAATGATCATTTGACTATTAGCGAAAGAAATTATCAATGGAGCATCAACCTCA
TCTCCTGAAGCTCGCCGATGAGTGGGCGCCAACATTCAAA 

>RXA01418 

ATGCTCGGCGATCGCACGCGCCTCCGCCTCCTCATCGCGCTGCATTATCACGGCCCCGGT 
GAAGCCACCGTCTCAGAACTCGCGGACATCGTCGGCGTCACCCTGCCCACGGCCTCCGCA 
GCGCTCCAACTGCTCGCAGATAACGGAGTGGTCGAGTCCTTCAAGGAGGGGCGGGTGACA 
AGATATAAGCTTGTCGACGCCACGACCCACACCTTGCTTCACCACCTCGGGGGCACCCAC 
CGACAT 

>RXA01418-downstream
TAAAGGGAACCAAATAGCGTTCG 

>RXA01450-upstream

ATGTGAAGCGAAATTTCACCTTTTTGGAGGCAGTCGACAATGCTCTGGGCACTGTTTTGT 
GCCTCCTGATTAGGTTCTTACCCAACGATTGCTAGGATAT 

>RXA01450

GTGCCTGTGACCCTTACTCTTGGAATCGTCGGCCTGCCCAACGTTGGCAAGTCCACCCTG 
TTCAACGCCCTGACTCGCAATGACGTGCTCGCAGCGAACTACCCGTTCGCCACCATCGAG 
CCAAACGTGGGCCTTGTCGAGCTTCCAGACGCTCGCCTTGAACGCCTTTCTGAAATCTTC 
GGCTCTGAGCGCATCCTGCCAGCAACCGTGTCTTTCGTTGACATCGCCGGAATCGTTAAG 
GGAGCTTCCGAAGGCGAAGGAATGGGCAACGCTTTCCTTGCCAACATCCGCGAAGCAGAC 
GCTATCTGTCAGGTTGTGCGCGCATTCGCTGACGAAAACGTCATTCACGTCGATGGTGAA 
GTTAACCCAGCAACCGATATCTCTGTGATCAACACCGAGCTGATCCTCGCCGACCTGCAG 
ACCGTGGAAAAAGCACTCCCACGCCTCGAAAAGGATGCACGCAAAGACAAGGGACTTGGC 
GAAGTCGTAGATGAGACCAAAAAAGCCCTTGCGATCTTGAGCGATGACCGCACCTTGTTT 
CTCTGCAGCAAAAGCTGGCGACAT

>RXA01450-downstream
TGATCTGGCCCTCCTGCGCGATC 

>RXA01451-upstream

GTGGAAAAAGCACTCCCACGCCTCGAAAAGGATGCACGCAAAGACAAGGGACTTGGCGAA 
GTCGTAGATGAGACCAAAAAAGCCCTTGCGATCTTGAGCG 

>RXA01451

ATGACCGCACCTTGTTTCTCTGCAGCAAAAGCTGGCGACATTGATCTGGCCCTCCTGCGC 
GATCTCCACCTGATGACAGCAAAGCCTTTCCTCTACGTCTTCAACTCCGACGAAAAAGTG 
CTCACCGACGACGCCAAGAAGGACGAACTCCGCGCACTAGTCGCGCCAGCAGACTGCGTA 
TTCCTTGACGCACAAACTGAAACCGAACTTCTTGAACTCGAAGAAGACGAAGCAGCAGAA 
CTCCTCGAAGCTGTAGGCCAAACGGAACCAGGCCTACACTCCCTCGCACGTGCAGGATTT 
GAAACCCTCGGACTACAGACCTACCTCACCGCGGGTCCTAAGGAATCACGCGCCTGGACC 
ATCCACAAGGGCGACACCGCTCCACAGGCAGCAGGCGTTATCCATTCTGACTTCGAACGC 
GGCTTCATCAAGGCTGAAATCGTCTCCTTCGAAGATCTTGACGCTGCTGGTTCCATGGCG 
GAAGCCAAGGCCCAGGGCAAAGTCCGCCAAGAAGGTAAGGACTACGTGATGGTCGATGGC 
GACGTTGTGGAGTTCCGGTTTAACGTC 

>RXA01451-downstream 
TAGCGTTATTGACGCTCCTCGTT 

>RXA01500-upstream

ACACCCTCACCTGGAATTTTGCAGAAATAGGGCAAGAATCAAAATAGTGGGT^AATCCCCA 
TGTTTCGCGAAGTCCCATTGTTGGAGTTAGGCTTATACCC 

>RXA01500 

ATGGCTACGCATCCAGATATTCCCACAGAGTTGCTTGAATCTCCGAGCTATCAACTTGAA 
CGACTTCGACGACGCACTCGTGACCATGTTGAGGCCGAATTGGCCAAGCATGAGACCACG 
ATGAGGGAATTCTGGACGCTTACATGTCTGGTTCATTCCGACGCTGCAAGCCAGTCAGTT 
CTGTGTGAGCTGCTGGCCATTGATGCATCGGATATGGTCAGACTCGTTGACTCACTTGAG 
GTACGCGGCTGGGCGAAAAGGGAACGTGATCCCAAAGACCGTCGTCGCCAAATTGTTGCG 
TCAACGAAGAAGGGAAAAAACGCCCAGGCGGATCTGCACAAAGTTGTGCTTGAGGCAGAG
GATGCTGCGTTGGATGAGTCTACGTCCAAGCAGTTGAAGCACCTTCGTAAATTGGCCGCA 
GCAATTATCTCCACCGAAGAGGAC 

>RXA01500-downstream 
TAAATATAACGTGGCATTGAGCA 

>RXA01537-upstream

CCTATAGCACAATCCCGCGTGGGGATTAATGGTCATCGAATATGGTCAGAGCACCCCCGC 
AAGTGCCTAGGTGTGTCCCCAGTCCTTTAAGATCAGTCAC 

>RXA01537 

ATGACGCAGGCAATAGCAGCATCCCTTGATTTAGCGGCTCGAATCACCGCCAAAATTGAT 
CAAGGAGTGCTCACTCCAGGTACTCGACTACCCGAGGTTGCTTTGGCAGAAGAACTTGGC 
GTTTCACGGAACACGCTGCGTGAAGCTTTTCGGGTACTCATGCAAGACGGACTGGTGGAT 
CATATTCCCAACCGTGGGGTTTTCGTGCACACGTTCACCAAGTCGGATGTGGAAGATATC 
TATGCTTACCGCACATTTATCGAGGTTGCTGCGATTAGGTCGGCGCGGAAAAATCCTCAG 
TTGCTGGAGCAGTCTTTGGGGGTAATGCGAGAGGCCTACGAAAGGGGTGCTGCAGCCAAT 
GCCGTGGGTGATTGGCAAACTGTCGGTTCTGCCAACAGTGCTTTTCACTTGGCGATTGTG 
GACCTAGCAGGAGTGGCTAGGTTGTCAGCAGACGCTCGAAAAGTGTTGGCGCTGGCTCGC 
ATCGGATTCATGGCCACCTACAACGTGGAGACATTCCATAGCATTTACGTGGAAAAGAAC 
CATCAAATCTTGAAGTATTTGGCTGCCGGTGAATTCGAAGAGGCGGAACAATACCTCCAG 
AAATACTTCGAAGATTCGCGCGATGATTTGTCTGCGCACCTACCGGAATTT 

>RXA01537-downstream
TGAAATTGCGATAGCATAGATGA 

>RXA01573-upstream

AGCCCCAGCTCACCGAATTCTCCATTCGTTTTAATTGCTTCGTTAATTAAAACGCCATAT 
AAAAACCGGCGCATTGCCGGTATTTTTCCAGGAGAATTTA 

>RXA01573

ATGAAGAGGCTTTCCCGTGCAGCCCTCGCAGTGGTCGCCACCACCGCAGTTAGCTTCAGC 
GCACTCGCAGTTCCAGCTTTCGCAGACGAAGCAAGCAATGTTGAGCTCAACATCCTCGGT 
GTCACCGACTTCCACGGACACATCGAGCAGAAGGCTGTTAAAGATGATAAGGGAGTAATC 
ACCGGTTACTCAGAAATGGGTGCCAGTGGCGTTGCCTGCTACGTCGACGCTGAACGCGCG 
GACAACCCAAACACCCGCTTCATCACCGTTGGTGACAACATTGGTGGATCCCCATTCGTG 
TCCTCCATCCTGAAGGATGAGCCAACCTTGCAAGCCCTCAGCGCCATCGGTGTTGACGCA 
TCCGCACTGGGCAATCACGAATTCGACCAGGGCTACTCAGACCTGGTGAACCGCGTTTCC 
CTCGACGGCTCCGGCAGCGCAAAGTTCCCATACCTCGGCGCAAACGTTGAAGGTGGCACC 
CCAGCACCTGCAAAGTCTGAAATCATCGAGATGGACGGCGTCAAGATCGCTTACGTCGGC 
GCAGTAACCGAGGAGACCGCAACCTTGGTCTCCCCAGCAGGCATCGAAGGCATCACCTTC 
ACCGGCGACATCGACGCTATCAACGCAGAAGCAGATCGCGTCATTGAGGCAGGCGAAGCA 
GACGTAGTCATCGCATTGATCCACGCTGAAGCCGCTCCAACCGATCTATTCTCCAACAAC 
GTTGACGTTGTATTCTCCGGACACACCCACTTCGACTACGTTGCTGAAGGCGAAGCACGT 
GGCGACAAGCAGCCACTCGTTGTCATCCAGGGCCACGAATACGGCAAGGTCATCTCCGAC 
GTGGAGATCTCCTACGACCGCGAAGCAGGCAAGATCACCAACATTGAGGCGAAGAATGTC 
TCTGCTACTGACGTTGTGGAAAACTGTGAGACTCCAAACACAGCAGTCGACGCAATCGTT 
GCAGCTGCTGTTGAGGCCGCTGAAGAAGCAGGTAATGAAGTTGTTGCAACCATTGACAAC 
GGCTTCTACCGTGGGGCGGATGAAGAGGGTACGACCGGCTCCT^CCGTGGTGTTGAGTCT 
TCCCTGAGCAACCTCATCGCAGAAGCTGGACTGTGGGCAGTCAACGACGCGACCATCCTG 
AACGCTGACATCGGCATCATGAACGCAGGCGGCGTGCGTGCGGACCTCGAAGCAGGCGAA 
GTTACCTTCGCAGATGCATACGCAACCCAGAACTTCTCCAACACCTACGGCGTACGTGAA 
GTGTCTGGTGCGCAGTTCAAAGAAGCACTGGAACAGCAGTGGAAGGAAACCGGCGACCGC 
CCACGTCTGGCATTGGGACTGTCCAGCAACGTCCAGTACTCCTACGACGAGACCCGCGAA 
TACGGCGACCGCATCACCCACATCACCTTCAACGGTGAGCCAATGGATATGAAGGAGACC 
TACCGCGTCACAGGATCATCCTTCCTGCTCGCAGGTGGCGACTCCTTCACTGCATTCGCT 
GAAGGCGGCCCAATCGCTGAAACCGGCATGGTTGACATTGACCTGTTCAACAACTACATC 
GCAGCTCACCCAGATGCACCAATTCGTGCAAATCAGAGCTCAGTAGGCATCGCCCTTTCC 
GGCCCGGCAGTTGCAGAAGACGGAACTTTGGTCCCTGGTGAAGAGCTGACCGTCGATCTT 
TCTTCCCTCTCCTACACCGGACCTGAAGCTAAGCCAACCACCGTTGAGGTGACCGTTGGT 
ACTGAGAAGAAGACTGCGGACGTCGATAACACCATCGTTCCTCAGTTTGACAGCACCGGC
AAGGCAACTGTCACCCTGACTGTTCCTGAGGGAGCTACCTCTGTCAAGATCGCAACTGAC 
AATGGCACTACCTTTGAACTGCCAGTAACCGTAAACGGTGAAGGCAACAATGATGACGAT 
GATGATAAGGAGCAGCAGTCCTCCGGATCCTCCGACGCCGGTTCCCTTGTAGCAGTTCTC 
GGTGTTCTTGGAGCACTCGGTGGCCTGGTGGCGTTCTTCCTGAACTCTGCGCAGGGCGCA 
CCATTCTTGGCTCAGCTTCAGGCTATGTTTGCGCAGTTCATG 

>RXA01573-downstream
TAATAACTTGTAGTAAATAAATC 

>RXA01655-upstream 

GTTTCGGTCATGACAGATGATCCAACGCCACAAAGTGGACTAGCGGTAGATCCACTTTCA 
GTCACTTGCATTAGACCACTTTTTTGAGGACGATGAAGCC 

>RXA01655 

ATGCTTGCCGACCTTCCCATCGCCTTAAACCCACACGAACCAACATCCATCCCCACGCAG 
CTCACAGAACAGATCCGTCGTCTCGTGGCGAGGGGAATTCTCACCCCAGGAGACCCGCTT 
CCCAGCAGTCGCTCACTATCCACCCAATTGGGGGTATCCCGCGGCAGTGTGGTGACCGCT 
TATGACCAATTGGCCGGTGAAGGCTACCTCAGCACCGCCCGCGGTTCCGGTACAACGATC 
AACCCAGATCTGCATTTGTTGAAGCCTGTGGAAATTGAGAAGAAGGAGACGTCGAGAAGC 
GTCCCGCCCCCGCTGCTCAACCTGAGCCCCGGCGTGCCCGATACCGCGACGCTCGCCGAT 
TCCGCATGGCGCGCTGCGTGGCGCGAAGCCTGCGCCAAGCCACCCACGCACTCCCCTGAG 
CAGGGACTTTTGAGGCTGCGGATCGAGATCGCCGACCACCTGCGCCAGATGCGTGGCCTC 
ATGGTCGAGCCGGAGCAGATCATCGTCACCGCCGGCGCGCGCGAGGGGCTGAGTCTGCTG 
CTGCGCACCATGGATGCGCCTGCCCGCATCGGCGTCGAATCGCCCGGCTACCCCAGCCTG 
CGCCGCATCCCGCAGGTGCTTGGCCATGAGACGATCGATGTGCCGACCGACGAATCCGGC 
CTCGTACCCCGCGCGCTGCCCCACGACCTTAACGCGCTACTGGTAACCCCTAGCCATCAA 
TATCCCTACGGCGGCTCGCTGCCCGCCGATCGCCGCACCGCGCTAGTCGCGTGGGCTGAG 
GCAAACGATGCGTTGCTTATTGAAGACGACTTCGATTCTGAGCTGCGCTACGTCGGTATG 
CCGCTTCCGCCGCTGCGTGCGCTGGCGCCCGATCGCACGATTCTGCTCGGCACGTTTTCC 
TCCGTGATCACACCACAAGTCGCCTGCGGATACCTCATCGCGCCGACGCCCCAGGCGCGC 
GTGCTCGCCACGCTTCGCGGGATTCTCGGCCAGCCAGTCGGCGCCATCACCCAACACGCG 
CTCGCGTCCTACCTCGCCTCAGGCGCTTTACGACGCCGCACCCAACGTTTGCGGCGCCTT 
TACCGACACCGCCGCTCCATCGTCCAAGACACCCTCGGTGACCTCCCGAATACGCAGCTT 
CGCCCCATCAACGGTGGCCTCCACGCAGTTCTCCTTTGCGACAAACCCCAAGACCTCGTT 
GTCACCACACTCGCCTCCCGAGGCCTTAACGTCACCGCGCTTTCCCACTACTGGGGCGGC 
ACCGGCGCAGACAACGGCATCGTCTTCGGCTTCGGCTCCCACGACGAAGACACCCTCAGA 
TGGGTGCTTGCTGAGATCAGCGATGCGGTGTCTCTAGGC 

>RXA01655-downstream 
TAAAGAAAAAACAGCCCGAGAGG 

>RXA01687-upstream

TTAGAGATCAACAAATAGACTGAACGGTCTATTTGTTCACGTTTTCGCAGCTAAATTCTT 
GAATCTAAACTATTCCCAAATAGACCATACGGTCTAACAT 

>RXA01687 

GTGTTCATGCTTGCACAGCGAACACTCCCCATTCACATCACCGCCCCCCACCTACCCGTC 
GCGCGCGTATTTCATCAAATTCGCGCCACAGACGCCGATCGCACCTCTCTGCAACGCGAT 
CTTGAACTCTCCCAAGCTGGCATCACGCGGCATGTGTCAGCGCTTATTGATGCAGGTCTC 
GTGGAGGAAACCCGAGTGGATTCCGGGGCGCGCTCGGGGCGACCGCGCACAAAATTAGGC 
ATCGACGGCCGCCATCTCACCGCCTGGGGAGTGCACATTGGCCTGCGCAGCACGGATTTT 
GCGGTGTGCGATTTAGCCGGCCGAGTGATTAGGTATGAGCGCGTGGACCATGAAGTTTCA 
CACTCCACGCCGTCGGAAACGCTGAATTTTGTCGCACATAGGTTACAAACATTGAGCGCC 
GGCTTGCCCGAGCCCCGCAATGTGGGCGTGGCATTATCTGCCCACTTAAGCGCCAACGGA 
ACCGTCACTTCCGAAGATTATGGCTGGTCAGAGGTGGAAATTGGGGCACACCTCCCCTTC 
CCCGCCACCATCGGATCAGGTGTTGCGGCGATGGCCGGTTCGGAAATTATCAACGCGCCA 
CTGACCCAATCCACGCAGTCCACGCTGTATTTCTACGCCCGCGAAATGGTCTCCCACGCC 
TGGATTTTCAACGGCGCTGTCCACCGCCCCAACAGCGGCCGCACGCCGACGGCGTTCGGA 
AATACAAATACCTTAAAAGATGCTTTTCGACGTGGACTCACACCAACAACTTTCTCCGAT 
TTAGTCCAACTCTCCCACACCAACCCGCTTGCACGACAGATCCTCAACGAGCGCGCCCAC 
AAACTTGCCGACGCCGTAACCACCGCCGTTGATGTTGTCGACCCCGAAGCCGTCGTCTTC
GCCGGCGAAGCCTTCACCCTGGATCCGGAAACTCTTCGCATTGTGGTGACCCAGCTCCGA 
GCAAACACCGGCAGCCAACTGAGAATCCAACGCGCAGACGCCTACATTCTCCGCACCGCG 
GCCATCCAGGTGGCGCTGCATCCGATCCGTCAAGATCCGTTGGCATTTGTG 

>RXA01687-downstream
TAATTACCACCCATGTTGCGGGG 

>RXA01759-upstream

CACTAAAGAACTTCTGAGCGCGCTATCGTTGGTCGATGCTATTGGTCTGGGTACTTCTCC 
GGTAGACCATCACTCTGAATAAGGGGGATAACATATAGTT 

>RXA01759

ATGACCAAACGGCTCAGCCTTGAAGGGCTCCGCTATGCGCAGGCCGTCGCAGAAACTCAC
TCATTCAGCGCAGCAGCCCGCGAATACGGAGTCACCCAACCTGCGCTATCCAACGGCATC 
GCCAAACTGGAAGATCGGCTCGGTGAACAACTCTTCGATCGATCTACTCAAGGCGTCACC 
CCGACGTCCTTTGGCCTCCACATCCTCCCCCTGATCCAACGCGCGCTGACTGAAATCGAC 
GCAATCACCGCGGAAGCGCACCGTTTGATTAACTCAGAAGCACGCAGCATTCGAGTTGGA 
ATCTCCCCACTTATCAACCCTCAACTGGTTGCACGAACATATACCGCGGTTCGTGAGCTT 
CCCACAGCACACGACCTAGTACTCCGCGAAGCAAACATGAAAGAACTACATGAAGGACTT 
CTTGCAGGTGAACTTAATGTAATTCTCATTCCCGCAGTGAAACCACTACCCCATTTTGAA 
CACCGCATCATTGACTCCGAACCAGTCGTTATCGTCGAATCCACCCAGGACAGCACCGAC 
CCCATAGAACTTCGCGAGACTCAGCACGAACCGTTCATTCTGGTACCCGACACATGCGGT 
TTAACCACTTTCACCAATCAACTGTTTGAAACAAATGACCTGGCATTAAACGCCTATTCC 
GGCGAAGCAGCCAGCTACCAAGTACTCGAACAGTGGGCCACACTTGGACTCGGATCTGCA 
ATGCTTCCACTTTCTAAACTCAGCTCCCCTACAGCACCCCAT 

>RXA01759-downstream
TGACCACTCCGCGAACAAGGCCT 

>RXA01763-upstream

ATTTTGAAAACTTACATCGTTCGCTTGACGCACAGAATGCATGCATGTTCAAATGATTGA 
AGATCGAAACTATTTTTCAGCCAGTTCACATGGAGCCACT 

>RXA01763

ATGACCACCAGCAACCCCACCGCCGAGATCATTGGCGGACCAGAACGATTCCTCGAGGCC 
GAATTGTCCCAGCAGATTCAATTCCTCACTGCCCGCGCACGAGCCAAGGGATCCGCCAAA 
GGAAACGAAGCCTTAGTCGACCTCGGACTTAAAGTTCGCCAATACTCCACACTGTCCCTA 
GCGGCCAGCGGATTAAAACCAACCCAACGAGAATTGGGAGCATTTCTCGACCTAGACCCA 
AGTCAGATTGTTGCCTTGGTCGATTTCCTAGAAAAGCGCGGATTAGTGGCCCGGGAAGTT 
GACCCCCGGGATAGGCGCTCGAAGATCATCATCGCCACCGAAAAAGGTCTGGAAATTCAC 
GACGAAGCCACCAAACGCCTCCTCATCGCCGAGGGTGAATCTCTAAAAAACCTCACCTCC 
GACGAGCAAGAACAACTAAGGGAACTGCTGCTCAAAATCGCCTTT 

>RXA01763-downstream
TAAGTCTCTTAACCACGCCGGCC 

>RXA01826-upstream

CTACACCTAGGACGAGTGCTAGCGTTCCAGTTGAGACTAATGCACCGGCTGATGATTTAA 
TCGACGCCGTAAATGGCCTATTGGATGTAGGAGGAGCGCA 

>RXA01826 

GTGACCTTCGTGATCGCTGATCGCTATGAACTGGATGCCGTCATCGGCTCCGGTGGCATG 
AGCGAGGTGTTCGCGGCCACCGACACGCTCATTGGTCGGGAGGTCGCGGTAAAGATGCTG 
CGCATCGACCTTGCGAAAGATCCCAATTTCCGAGAACGCTTCCGCAGGGAAGCCCAAAAC 
TCCGGAAGGTTGAGCCACTCTTCGATCGTCGCTGTTTTTGACACCGGCGAAGTAGACT^AA 
GACGGCACCTCTGTTCCCTACATTGTGATGGAACGCGTGCAGGGTCGAAACCTGCGCGAA 
GTTGTCACCGAAGACGGCGTATTCACCCCAGTTGAGGCAGCCAACATCCTCATCCCTGTG 
TGTGAAGCGCTGCAGGCATCCCATGACGCCGGCATTATTCACCGCGATGTGAAACCCGCC 
AACATCATGATCACCAACACCGGTGGCGTGAAAGTCATGGACTTCGGCATCGCCCGCGCG 
GTCAACGATTCCACCTCCGCCATGACTCAAACCTCCGCAGTCATCGGCACCGCCCAGTAC 
CTCTCCCCTGAGCAGGCCCGCGGCAAACCCGCCGATGCGCGTTCCGATATTTACGCCACC
GGCTGCGTCATGTACGAATTAGTCACCGGTAAGCCACCTTTTGAAGGCGAGTCCCCTTTC 
GCCGTGGCCTACCAACACGTCCAGGAAGACCCCACCCCTCCTTCGGATTTCATCGCGGAC 
CTCACCCCGACCTCTGCTGTCAACGTGGATGCCGTGGTACTCACCGCCATGGCAAAACAC 
CCCGCCGACCGCTACCAAACAGCCTCCGAAATGGCCGCTGACCTGGGCCGGCTATCCCGC 
AATGCAGTCTCCCATGCCGCACGCGCGCATGTAGT^AACAGAAGAAACCCCAGAAGAGCCC 
GAAACTCGCTTCTCGACGCGCACCTCCACCCAAGTGGCCCCCGCCGCAGGCGTGGCTGCG 
GCCAGTACGGGGTCAGGGTCTTCTTCGCGTAAACGTGGATCCAGAGGCCTCACCGCCCTG 
GCCATCGTGTTATCCCTAGGTGTCGTCGGCGTTGCCGGTGCCTTCACCTACGACTACTTT 
GCCAACAGCTCCTCCACTGCAACCAGCGCGATCCCCAATGTGGAAGGCCTCCCGCAGCAA 
GAAGCTCTCACAGAACTTCAAGCAGCAGGATTTGTTGTCAACATCGTCGAAGAAGCCAGC 
GCCGACGTCGCCGAAGGCCTCGTCATCCGAGCAAACCCAAGCGTTGGATCCGAAATCCGC 
CAAGGGGCCACCGTCACCATCACCGTGTCCACCGGCCGAGAAATGATCAACATCCCAGAC 
GTCTCCGGCATGACACTTGAGGACGCCGCCCGCGCCCTCGAAGACGTTGGTCTCATACTC 
AACCAAAACGTTCGGGAAGAAACCTCCGACGACGTCGAATCTGGCCTCGTCATCGACCAA 
AACCCCGAAGCCGGCCAAGAAGTAGTCGTGGGTTCCTCTGTATCTCTAACCATGTCTTCA 
GGCACCGAGAGCATCCGAGTGCCCAACCTCACCGGCATGAACTGGTCACAAGCAGAACAA 
AACCTCATCTCCATGGGCTTTAACCCCACAGCTTCCTACTTAGACAGCAGCGAACCAGAA 
GGCGAAGTCCTCTCAGTTTCCAGCCAAGGAACTGAACTACCCAAGGGTTCATCCATCACA 
GTGGAAGTCTCCAACGGCATGCTCATCCAAGCCCCCGATCTCGCCCGCATGTCCACCGAA 
CAGGCCATCAGTGCCCTCCGCGCTGCTGGCTGGACCGCCCCAGATCAATCCCTGATCGTC 
GGCGACCCCATCCACACCGCAGCCCTCGTGGATCAAAACAAAATCGGATTCCAATCCCCA 
ACCCCTGCAACCCTCTTCCGCAAAGACGCCCAAGTGCAAGTGCGACTCTTCGAATTCGAT 
CTCGCTGCACTCGTGCAA 

>RXA01826-downstream
TAGCCAACAAGGAAACCGTCAAG 



>RXA01827-upstream

GGTGAAAGACGGCGGTGGATTTGGCACCAGTGCAACTGGTGGTCAGGTCGCAGCCCCAAT 
TGGCCGAGCTGTGCTTCAGGCAGCCGGAGGATTTTAAAAT 

>RXA01827

ATGAGTCAAGAAGACATCACTGGAAAAGATCGACTCCAAGAACTCATCGGCGCTGATTAT 
CGTCTGCAGTGGATCATCGGACACGGTGGCATGTCCACCGTATGGCTCGCAGATGATGTG 
GTCAATGATCGCGAAGTAGCCATCAAGGTACTGCGCCCGGAATTTTCCGACAACCAGGAG 
TTCTTGAACCGTTTCCGCAATGAAGCGCAAGCGGCTGAGAATATCGATTCTGAACACGTG 
GTGGCCACCTATGACTACCGTGAGGTTCCAGACCCTGCTGGGCATACTTTCTGCTTCATC 
GTCATGGAATTTGTCCGCGGTGAATCGCTTGCGGATCTTCTAGAGCGCGAAGGCAGACTG 
CCGGAAGACCTGGCTCTTGATGTGATGGAACAGGCGGCACATGGTTTGTCGGTGATTCAC 
CGGATGGACATGGTGCACCGCGATATCAAGCCGGGCAACATGCTGATCACAGCCAATGGC 
ATTGTGAAGATCACGGACTTTGGTATCGCTAAGGCTGCCGCTGCTGTGCCTTTGACCCGC 
ACCGGCATGGTGGTGGGTACTGCTCAATATGTTTCACCTGAGCAAGCCCAGGGCAAGGAA 
GTCACCGCGGCTTCTGATATTTATTCTCTCGGTGTGGTCGGCTATGAGATGATGGCTGGC 
CGCCGCCCGTTCACTGGAGATTCTTCGGTGTCTGTGGCGATCGCGCACATCAACCAAGCT 
CCGCCGCAGATGCCCACCAGCATTTCGGCACAGACTCGCGAGTTGATTGGCATTGCGTTG 
CGCAAGGATCCGGGTCGCCGTTTCCCTGATGGAAATGAAATGGCGCTAGCTGTTTCTGCT 
GTGCGCCTTGGCAAGCGCCCGCCTCAACCGCGCACGAGCGCGATGATGGCGCAGGCGGAG 
GCGCCGTCGCCAAGCGAATCAACGGCGATGCTGGGCAGGGTGGCCCGGCCTGCAACAATC 
ACCCAAGAAGCGGCCCCGAAACGCGGTTCCGGCATTGGCATTGGTCTGTTCATCGCAGCT 
TTGCTTGCCGTGATTATTGGCGCGGTGATCTATGCGGGCACCACCGGAATTTTGTTCAAC 
GACACTCCGGAAGAAACCACCACACCTGAAACCATTACGGAAACATACACCCCAACCGTG
GAGGAAACCACCTCTCAGTGGGTACCGCCAACGCCTCCAACACGGTCAACATTCACCGAA 
CCTGAAACAACTTCACACCGTCCGACGACAAGTGAAGAGAGCACATCCGAGGAACCAACC 
ACGGAAGCTCCAACAAGTAGCCGAACTGTGCCTCAAATCCCTACCTCTACACCTAGGACG 
AGTGCTAGCGTTCCAGTTGAGACTAATGCACCGGCTGATGATTTAATCGACGCCGTAAAT 
GGCCTATTGGATGTAGGAGGAGCGCAG 

>RXA01827-downstream
TGACCTTCGTGATCGCTGATCGC 
>RXA01830-upstream

ACGGCACTTTTGTCGGTGGTACGCGCATTGATCAGCCTGAGCAGATTGCGGTGGGCACGG 
ATATCCGTATTGGTCGTACAGCAGTGAGGCTTGTTCCCTG 

>RXA01830 

ATGTTGAAACTTAAATATGCGGTGGCATCTGACCGAGGGTTAGTGCGCGGGAACAATGAG 
GATTCCGCTTACGCTGGCCCGCATTTGTTGGCGCTGGCTGATGGTATGGGCGGCCATGCT 
GCTGGTGAGATCGCTTCCCAAACCATGATCAACCATTTGCGTGCGCTTGATGTTGATCCT 
GGTGATAACGATATGTTGGCGCTGGTGGGCATGGTGGCAGGCGAAGCCAACGCGGCGATT 
GCTGAGGGCATCGCCGAAGACCCGGCGCGCGACGGCATGGGCACTACGTTGACGGCGTTC 
ATGTTTAACGGGCGTGACCTGGCAATGTGCCACGTCGGCGATAGTCGTGGTTATGTGCTT 
CGCGACGATAAGTTGGTACAGGTTACAGTCGACGATACTTTTGTGCAGTCGTTGGTCGCT 
GAGGGCAAGCTTGATCCAGAAGATGTTTCAACTCACCCTCAGCGTTCTTTGATTCTGAAG 
GCTTACACCGGCCATCCTGTGGAGCCCACTCTGGAGCAATTCCCGGCCTTGCCTGGGGAT 
CGTTTGTTGTTGTGCTCTGATGGTCTATCAGATCCGGTTACACACTCCACGATTGAAGAA 
ACAGTGCGTGTAGGCACCCCGCAGGATGCGTCCACCAAGTTGGTGGAGTTGGCGCTGCGT 
TCTGGCGGTCCGGACAATGTGACGGTCATTGTGGCCGATGTTGTAGAAGTCACCGAGGCG 
GAAGCAGCAGCGGAAGCATCAGTGCCTGTCACGGCTGGTGCGCTCAATGGTGAGCAGCCT 
GAAGATCCGCGGCCTGATACCGCTGCGGGACGCGCTGCGGCGATCACACGGCGAGCTCAA 
GTGATTGATCCGGCACCAAAGATATCTGATGCTGGAACGGAGGATATTCCCACAATTGAG 
GAGCCACCAGAGAAAAGTTCCAGCAAACTTGCGGTATTGATCGTAGCCCTGGTCATCCTC 
ATCGGTGTAGTTGCCGCAGGATGGTGGGGCTACTCCCGTATTGACAGCACTTTTTACGTC 
GCGGTCAATGATGAGGAAGCCATCACCGTGGAACACGGTGTGGATTACCGCATCTTTGGC 
AAGGATTTACATTCGCAATTCCAGGTGGCGTGCCTGAATGAAGCTGGCACCTTGTCACTC 
AAGGAATCCTGTGAAAACGGTACGTCTTTCAAATTGGATGATTTACCGGCATCTGTTCGC 
GGTAGTGTCGCAGGATTACCGTCTGGGTCGTATGACGAGGTCCAGGCGCAAATGCAACGG 
CTGGCTGCTCAAGCTTTGCCAGTGTGCGTGAACTTAGAAGTAACAACCGGTGGCGATAGA 
AACGAACCCGGAGTCAATTGTAGGGAGGTCTCA 

>RXA01830-downstream
TGAACACGCTTGAACGATTAAAG 

>RXA01836-upstream

AGAGGCATAAATGTCACCTCCCGCCCAAAATCTTTTTATACCCCCACACACAGTGAATCC 
CTTCACCACGTCTCATTGGGTGAAATGCTAAATTCAAGGT 

>RXA01836 

ATGGGACAGCAAGAAATTATCGAGGACTCCACCGAGAGCGGTATTAAGGTTTTAGACCGC 
ACTGTATTAATCCTCAATGTCATCGCAGAACAGCCTCGATCGTTGGCAGAGCTCGCAGCT 
GCCACCGATCTGCCCAGGGCTACAGCCCACCGCCTCGCCTCAGCGCTTGAGGTACACGGC 
ATGTTGGCACGCTCCCGCGATAATAGATGGACCATCGGCGCACGGCTTGCCTCATTGGGT 
GCACGCGGCGCTGACACCCTCATCGATACGGCCGTACCAATTATGGCCGACCTTATGGAG 
CGCACCGGCGAATCCGTTCAGCTTTATCGCCTCACCGGCACCACCCGCACGTGTGTGGCC 
AGCCAAGAGCCCAGCTCCGGGCTAAAAAACGTGGTTCCCGTGGGCACTCGCATGCCTTTA 
AATGCAGGGTCAGCAGCGCGCGTTTTTGCCGCCTACCTCCCCATCCCCTCTGCCAGCGTC 
TTTTCCCGCGAGGAGCTTGACCAGGTGCGCGCCAGCGGCTTAGCGGAGTCCGTGGGCGAG 
CGTGAGCTCGGCCTTGCTAGCCTCTCCTCCCCTGTTTTTGATTCCAACGGATCCATGATC 
GCGGCACTGTCCATCTCCGGCGTGGCCGAGCGCCTCAAGCCCCACCCCGCCGCCATGTGG 
GGCACCGAGCTTATCGACGCCGCCGAGCGCCTAGGCGCTTTGCTT 

>RXA01836-downstream
TAAGAGCTTTTCGACGCACAACC 

>RXA01840

ATCTCTGAAGAAGACGGCGCCAGCGAACCCGCCACCTTCGCCGAACGCTCCCAACGCCTC 
ATCCAGCAGGAATGCGTTGCAGCCGTGTTTGGTGGATGGACCTCCGCCTCCCGCAAAGCA 
ATGCTCCCCGTCTTTGAGGGCAATAACTCCCTGCTGTTCTACCCGGTGCAGTACGAGGGC 
ATGGAATCCTCGCCGAATATTTTCTACACCGGCGCCACCACCAACCAGCAGATCATCCCG 
GCTCTTGATTACCTGCGTGAAAACGGCCTGAACCGCCTTTTCCTTGTCGGTTCCGATTAT 
GTTTTCCCACGCACTGCAAATTCCATCATCAAGGACTACGCCGAAGCCAATGGTATGGAA 
ATCGTCGGCGAAGACTACGCGCCGTTGGGATCCACCGACTTCACCACCATCGCCAACCGC
ATGCGTGACTCCAACGCAGATGCCGTGTTCAACACTTTGAATGGCGATTCCAACGTGGCG 
TTCTTCCGCCAGTACAACAGCCTCGGCTTCAATGCAGACACCCTTCCGGTGATGTCAGTA 
TCCATTGCGGAAGAAGAAGTCGGAGGCATCGGCACCGCAAATATTGAGGGCCAGCTGGTG 
GCGTGGGACTACTACCAAACCATCGACACCCCAGAAAACGAGACCTTCGTGGAG 

>RXA01860-upstream

AGTTGGTCAATCAGCACTTCAAAGGCCGAGGAAAAGTGTTATCGCGGGTGTCAAGTAAAG 
TATCGGCGGGACGCTCTTTTCTGATGCTCGTGCCCAAGGT 

>RXA01860

GTGAATCCATTCATTCTTGCTGATCAGCTGCTTTACGATGCTAAGCACGCAGGTAGAAAT 
CGGGTTGCGGTGCGCAGAGCTGAAAACACCATTGTCCGCTCAGCTAAGCCCGCATTCTCA 
GTTGAGGAACTTTCGGAGATCCTGGAGTCACATTCTATTCGCCTCGAGCTGCAGCCGATC 
CTAGAACTTGAAACAGGTCGGGTGGGTGCAGCCGAAGGTCTGCTCCGAATCAACTTGGAT 
GGCACCGATGTTCCTACGGGGCAGTTTGTTCAGTCGGTTGAACAGGCCGGGCTAGCCCCG 
AAGCTTGATATCGCAGTCATGAGAGAAGGAATTAATCATATTGAGAGGCTGAGAGCTGTG 
TGTCCGACTTTCAGCCTCGCTTTGAATCTGTCGGGCTATTCTCTGAGCTCGGCGAAAATA 
CGGGAGGAACTAAGAGCCGAATTTAGAGCTCGCGATCTGCCAAGGGGATCAATTAGGTTT 
GAGATTACTGAGACCGCTCCGATTGAAGACATTGACGCGGCAAAAGAGTTTGTGCAGATG 
TTGAAAGATTTTGGCTTCCACATCGTAATCGATGACTTTGGCGCAGGACATGAGCCTTAT 
CAATATCTAAAGAAGTTCGACTTTAGCGTGCTGAAGATTGCAGGTGAATTCATAGAAGGT 
ATGGTCACCAACCGCGTGGACCGAAGCATCGTCGAATCTATTGCTCAACTTGCTAAGGAT 
GAGGAGATGGAAACTGTCGCCGAGTTTGTTTCAAGCAAGGAGATTTTGGAGGCGGTACGA 
GAGATAGGCGTAACGTACGCCCAGGGTTTCCATATTGGTAAATCTAAGCCGATTGATGAA 
TTTATAGCTACTTATCTCGAGACGAACCAAACCGCTACCTGGGGG 

>RXA01860-downstream
TAGGAAGAATATGAAAAAGAAGA 

>RXA01861-upstream

TCTTTGCGTTCGCGTATTTGATTGCTGTAATTGGCTCATACTCACTGGTATACGGCGATT 
CGTTAGTCGCAATTGTGTGGCCATCAGTGGGAATCGCTGT 

>RXA01861

GTGGTGGCCCGTGACCTGCAGAAGCTGGAAAAACTTCGCCTGATTTGTGGATACGTGTTT 
CTAGTCCCAGCCATATACCTGCACTTTTTTGCGGAAACCTCCCTCAGGGGAGTGATTCTG 
GCAGGAATTGCGCACGCTATCGCAGGTCCTGGCGTTGCACTGGTTATGGCATTCATGGAA
AATGCGCAATTGCCAGAACTGTTGCGTAAACGGCATGCATTCGCACCCTTCTCCCATATT 
CGCCTTCCAGGCGATGTATTCCGGCTCCTCGTCGCGGGCATTGTCATGGTCGCAATATCC 
AAATTGATTGTGATTCTTGCTTATGCACTGGCAGATTTGCCGTATTCATTCACCCTTTAT 
CTGACGATGGCCCTTCGTGACTTGACTGGCATTATTGTGGTTGCCGGGCCCGGAATTGCA 
CTTTCGACGCCGCTGGTACTAAATATTCACCGATCAGCATGGCGCGAGTTCGCAGTTGTT 
ATCATAGCTACGGTCGGAGTGCTGGCGCTCATTTTCGGATTTGCTGTGGATCTTCCGACG 
GTCTACTTGGCAATGTTGCCATTGTATTGGAGTGCAACCCGTCTTCCAGTGCTTTTAGCC 
GTTCTTCATGCGGTGTTTACTTCAGCAATAGTCGTAATTCTGTATTTCCTATTAGGTACC 
GGATCTTTTGCGATTACGGATGAATCCATACTGGTGCAGGCAACGACAATTCAGCTTTTT 
GTTCTGATGTGTATCTTGTTGTCGCTAGTTGTGTCAACGACAGTCCAGCAGACATCAGCA 
CTGGTTGAAGAGCTAGAGGTGGTAGCGAAGACCCTTCCTGATGCGCTTTTTATCGTAAAC 
AAAAATGGAACAGCATTTCCTGTTAACGCAGGCGCGAAAAATTTCGTCAAGCAATCACCG 
GATGGGCATTATTCCATGCCGAAACTACAGAATATAGACGGTGAACCCATGGATGAGAAA 
GAAAGTCCGAGCAGTATGGCCTTGCGTGGACAAGGTGTCGAAGGAGTATTAGCCAAGTTA 
GGTGAAGTACTGGGAGAAGATCCGGACTTGGCGCGTCGAATCTTCGAAATTAGTGCCTCA 
CCGATGTATCTGCGTGGAGAAACTGAACCGGGTCATGCGCTCGTGATTTGGCATGACAGT 
ACTAATGAGTATTACACGATGCAACAATTGACGCTTGCATATGAAGAATCGCGGCTGCTA 
TTTGAAAAAGCCCCTCAAGGGATTGCCATGCTGGACCCTTCGGGAGAAATCGTAATGGCG 
AATCGATCCTTTGGTGACTTGGTGGGAACGACTCCTGTTCGACTCCTAGGACGAAATCTA 
GAGGATTTCGGAGTAGAGGAGGGAACCATGGAATACGTGACCCCTGTTCTGTCGGACCCA 
GAAGCCGTTGTGCACTTAGATCGTTCGCTCGAAACATTGAGAGGTAAACAGAAAAACGTT 
GCTATGTCATTTAGCTCGATGGGCAATGTTGGAGGCAGAATCGGAACTTTACTCGTTAAT 
GTTGTCGATGTAACCGAGCGCCAAGAACTCATCGAGCTTGTGGAGCATTTGGCGGATCAT 
GACTCCCTGACAGGATTGGTCAATCGCAGGCGGCTGGAATCTGATATCGAAGAGCTTATC
CTCAAGAATGAACGCGATTCGACCGATAGTGCATTGTTGCTTTTGGATCTGGATTACTTC 
AAGGAAGTTAATGATTCCCTCGGCCATGAGGCTGGTGACCAGTTGCTTATTGAGTTTGCT 
GAGATCCTCAAAGACAGCGTGAGGGATTCCGACATTGTCGGACGCATCGGCGGCGATGAA 
TTCGTTATTGTTTTGCCTGACACAGACAGGGATGGCGCTGAAGCAATCGGTATAAGAATT 
ATTGAGTTGGTCAATCAGCACTTCAAAGGCCGAGGAAAAGTGTTATCGCGGGTGTCAAGT 
AAAGTATCGGCGGGACGCTCTTTTCTGATGCTCGTGCCCAAGGTG 

>RXA01861-downstream
TGAATCCATTCATTCTTGCTGAT 

>RXA01898-upstream

CAACGGCACTTTTGCGCCCAATGATCGACGCCCCGGTGAACTAATCCGCCTCCTTGATAG 
CCTCCTAACGGCTGTCCGCGATTAAGGCTCTGAAATACTA 

>RXA01898

ATGAGTGTGAAAGCACATGAATCTGTCATGGATTGGGTCACCGAGGAGCTCCGCAGCGGT 
CGCCTAAAAATCGGTGACCACCTCCCCAGCGAACGGGCGCTCTCCGAAACCCTCGGAGTT 
TCCCGAAGCTCCCTGCGCGAGGCGCTTCGTGTGCTCGAAGCCCTCGGCACCATTTCCACC 
GCCACCGGATCCGGCCCGCGGTCTGGCACCATCATCACTGCTGCCCCTGGCCAGGCGCTT 
TCCCTCTCCGTGACGCTGCAGTTGGTCACCAACCAGGTCGGCCACCACGATATTTATGAA 
ACCCGCCAACTCCTTGAAGGCTGGGCTGCCCTGCATTCCAGCGCCGAACGTGGCGACTGG 
GACGTGGCAGAAGCGTTGCTGGAAAAGATGGACGACCCCTCGCTACCGCTCGAGGATTTT 
TTGCGTTTCGACGCCGAATTCCACGTTGTTATCTCCAAAGGCGCGGAAAACCCTCTGATC
AGTACGCTCATGGAAGCCCTCCGTTTGTCCGTGGCAGATCACACCGTTGCCAGGGCCCGG 
GCGCTCCCCGATTGGCGAGCCACCTCGGCGCGTCTGCAGAAAGAACACCGCGCAATCCTC 
GCAGCACTTCGCGCAGGCGAATCCACAGTGGCCGCAACCTTGATCAAAGAACACATCGAA 
GGCTACTACGAAGAAACCGCTGCCGCCGAGGCC 

>RXA01898-downstream
TAAATGTCCCGCACTCTGTGGGC 

>RXA01935-upstream 

ATACGAGAAACCAAAAATTACCGGCATCTAAATACTTATCCATGCCCATTTACAGACAAT 
GCGGTCTAGATAACAAGACCCCTATTAGACAGCGTTGTCT 

>RXA01935 

ATGATTGGCTATGGTTTACCTATGCCCAATCAGGCCCACTTCTCTGCGTCCTTTGCCCGC 
CCCTCTACCCCGGCTGCAAAGTGCATGCACCATATCCGCCTCGGCCAGCAACTCATTAGA 
AATGAGCTGGTCGAGGCCACAGGTCTGTCCCAACCGACTGTCACCCGCGCAGTCACCGCT 
TTAATGCAGGCAGGTTTGGTTCGTGAACGCCCTGATCTCACACTCTCATCGGGCCCTGGT 
CGTCCCAATATTCCTCTAGAACTCGCTCCAAGTCCATGGATTCATGCAGGCGTGGCAATC 
GGCACCAAGTCTTCCTACGTCGCTTTGTTTGATACCAAGGGTCGCACCCTTCGTGATGCC 
ATGCTGGAAATCTCAGCAGCTGATTTAGATCCAGACACTTTCATCGAACACCTCATTGCT 
GGTGTCAACCGCCTCACCACTGGTCTTGATCTACCACTGGTAGGTATTGGTGTTGCCACC 
TCAGGAAAAGTCACCAACGCAGGCGTTGTCACCGCAAGCAACTTGGGCTGGGATGGCGTT 
GATATCGCTGGCCGCCTGAACTACCAATTCAGCGTTCCAGCAACCGTGGCATCAGCAATT 
CCTGCCATCGCAGCTTCTGAACTGCAGGCTTCCCCACTTCCCCACCCTGAGCAGCCAACT 
CCCATCACCTTGACCTTCTACGCCGATGACTCTGTGGGCGCGGCCTACAGCAATGATTTG 
GGAGTACATGTCATTGGACCACTGGCTACAACTCGTGGATCAGGTTTGGATACTTTGGGC 
ATGGCTGCCGAAGATGCGCTGAGCACCCAAGGTTTCTTAAGCAGGGTTTCTGATCAGGGT 
ATCTTTGCCAACAGCCTTGGTGAGCTAGTCACCATTGCTAAAGACAATGAAACCGCACGG 
GAATTCCTCAACGATCGCGCGACCCTGCTGGCTCACACTGCCGCAGAAGCTGCCGAAACA 
GTTAAGCCATCCACCCTGGTTCTCTCGGGATCGGCGTTTTCCGAAGATCCACAAGGTCGG 
TCGGTGTTCGCTTCCCAATTGAAGAAGGAATACGACGCAGACATTGAGCTCCGGTTGATC 
CCCACCCACCGGGAAAACGTCCGCGCAGCAGCTCGAGCAGTCGCACTTGATCGACTACTC 
AACGAGCCACTTACTCTCGTACCC 

>RXA01935-downstream 
TAACCTCATCTAAGCTCAGTGCT 
>RXA02127-upstream

TTTGCTAATACCAGGGTAAACAGCCACTCCCAGGGCAATTGAAACACGGATTAGATGGAT 
TGAGGGAACATCCCAAACCCAAAGAACCTAGGAGAACATT 

>RXA02127 

ATGAGTGAGACAGTTTTAGTCATAGGAGCAACAGGAAGCATAGGCCGACATGTTGTCTCG 
GAAGCACTTAACCAGGGATACCAAGTTAAGGCATTTGTCCGTAGCAAGTCCCGTGCACGG
GTGCTTCCAGCTGAGGCAGAGATTATCGTAGGAGACCTGCTTGATCCTTCCTCGATTGAG 
AAAGCTGTAAAAGGCGTCGAGGGAATCATTTTCACTCACGGCACCTCCACTCGTAAAAGC 
GATGTGCGGGATGTTGATTACACCGGCGTTGCCAACACGTTGAAGGCAGTCAAGGGAAAA 
GATGTAAAAATTGTGCTGATGACCGCCGTTGGAACGACCCGCCCAGGTGTGGCTTATGCC 
GAGTGGAAGCGACATGGCGAGCAACTTGTTCGAGCTAGCGGACACGGTTACACCATTGTT 
CGCCCTGGTTGGTTTGATTACAACAACGATGACGAGCGTCAGATCGTCATGCTTCAAGGC 
GACACCAATCAGTCGGGTGGCCCAGCCGATGGCGTGATTGCGCGTGATCAAATCGCGCGA 
GTTTTGGTTAGCAGTTTGAATGATGCAAAAGCACGAAACAAAACCTTCGAGCTTTCTGCC 
ACTTATGGACCTGCCCAAGGAAAGCCTGACCGCAACTTTTGCAGCACTTCGGGC 

>RXA02127-downstream
TGACGATACCGATGATATTGACG 

>RXA02210-upstream 

TTCATGGCCAATGTGGTGGGGTCCTTTCTAATACGCCAAATTTTTCAAACCCATCCTCTT 
CCATGCGTGTTTTCACCGCTATTTTCCATAGGAGTACATT 

>RXA02210 

GTGTCCGTAGCGGCAGGCGACAAACCAACAAATAGCCGTCAAGAAATCCTCGAAGGTGCC 
CGACGGTGCTTCGCTGAGCACGGCTATGAAGGCGCAACCGTACGCCGACTGGAAGAAGCA 
ACAGGTAAATCACGCGGAGCGATCTTTCATCACTTCGGTGACAAAGAAAACCTGTTCCTA 
GCCCTCGCGCGGGAAGATGCAGCCCGCATGGCGGAGGTGGTGTCTGAAAATGGCCTCGTT 
GAAGTGATGCGAGGAATGCTGGAAGATCCTGAACGATATGACTGGATGTCAGTACGCCTG 
GAGATCTCCAAGCAGCTGCGCACCGACCCGGTATTCCGCGCAAAATGGATTGATCACCAA 
AGTGTTCTAGACGAAGCTGTCCGCGTGCGTTTGTCCCGCAACGTGGATAAGGGACAAATG 
CGCACTGACGTCCCGATCGAAGTGCTGCACACCTTCTTAGAGACTGTTCTCGACGGTTTC 
ATCTCCCGTCTTGCTACCGGCGCATCCACAGAAGGACTGTCCGAAGTATTGGATCTGGTC 
GAGGGAACTGTCCGTAAACGCGAC 

>RXA02210-downstream 
TAAACGACCCCTGATTCACACTT 

>RXA02232-upstream

AATAGAGCGCTCAATTTAGGTGTTTATGGTGGCTTGATCGCTCACGGCGTTGCCCGACAC 
CTAGCCTTGGCGACAGCAGACGTGAAACAATAGATAAACA 

>RXA02232 

ATGGATGAAAAGAAAAACTTAAGTCATGATGAACTTCTCGCTCAGGCCTTCCGCGGTCAC 
AAAAATACCGTGCGCCCAGGATCTGACGAGACCTCAGGTTTTGATCTCAGTGGTTTTATC 
CGAGCTGAAGAACCATCAACTGGTGATCTCGACCTAGAGGCCCGCGATGCCCAACGTCGC 
CGGGACACCGAAATCCACGCTGATGAAGCAGCAGATGGCTACGAGGTTGAGTACCGAAAG 
CTGCGACTTGAGCGCGTTATCTTAGTGGGCGTGTGGACCGAAGGTACCACCGCAGAAATT 
GACGCCAGCCTTGCGGAACTTGCAGCGTTGGCTGATACCGCCGGCGCTGAGGTTATTGAA 
ACGCTGTACCAAAAGCGCGATAAACCAGATCCTGGAACCTACATTGGTTCCGGCAAGGTT 
CGGGAGTTAAAGGAGATCATCGAAGCCACTAGTGCAGATACCGTGGTGTGCGATGGTGAA 
CTTAGCCCTTCCCAGCTCGTGGCATTAGAGCGCGAACTTGATATCAAGGTCATTGACCGC 
ACCATGCTGATTCTGGATATCTTCGCCCAGCACGCTAAATCGCGCGAAGGTAAAGCCCAA 
GTCGCGTTGGCGCAGATGGAATACCTGATTAGCCGTGTGCGTGGTTGGGGTGGAAACCTC 
TCCAGGCAGGCCGGTGGTCGTGCAGGTTCTAATGGTGGTGTGGGTCTGCGTGGTCCAGGT 
GAAACCAAAATTGAAGCAGACCGCCGTCGTCTTCGATCGGATATGGCTCGCCTGCGCAGG 
GAACTTTCGGGGCTGGATACGTCGAGAAGCATTAAAAGAGCGCAACGCGCAGCCTCCCTG 
GTGCCGCAGATCGCCATCGCTGGCTACACGAACGCCGGCAAATCTTCGCTGATTAACGCG 
ATGACCGGCGCGGGTGTGCTGGTGGAGAACGCGCTGTTCGCCACGCTTGATCCAACAACC 
AGAAAAGCCGAGCTTGCCGACGGCCGACACGTCGTGTTCACGGACACCGTCGGCTTTGTG 
CGACACCTGCCGACCTCTCTGGTTGAGGCGTTCAAATCTACGCTGGAAGAAGTCGTGGAG
GCGGACCTCATGCTGCACGTGGTGGATGGATCCGATCCGTTCCCGCTGAAGCAGATCGAC 
GCTGTGAACACCGTGATTAGCGATATTGTGCGATCCACCGGTGCGGTGCCACCACCAGAG 
ATCATCGTGGTGAACAAAATTGACCAAGCTGATCCGCTGACGCTGGCAGAACTACGCCAC 
GCCGTCGACGATGTGGTGTTTGTCTCTGCGCTGACAGGGGAGGGAATTAAGGAGCTGGAA 
GCTCGCATCGAACTATTCCTCAACTCCAGGGACGCGCACCTACTGCTGAAAATCCCGTTC 
ACCCGTGGCGATATTGTGTCCCGCCTGCACCAGCATGGCACCGTTCTCAGCGAAGACTAC 
GCCGAAGACGGCACCTTGATGGATGTGCGTATCCCCACCCAATTGGCCCAAGAGCTGCAG 
AGTTACGTTGTAGT^CCCACCTCTGCC 

>RXA02232-downstream
TAACTGTCGATTTCCCAAGAGCC 

>RXA02270-upstream

CAGCTTGAGCAGGCATCGGCCGATGACGTACCAGGAGTTTCGGCATGATCCGGCGGCGTC 
GCATCGCTATTGGGCGCGGTCGTTTGTGGGGTGGCGGGTG 

>RXA02270

ATGGATCAGGCGCGGCCGAATCGAACGCACTACGCCATGGTTGAGCTGGAGCAGCATGGT 
TTTTTAAGTGGTGTGGTCACCCAAAATGTCGATGGTTTACACGCGGAAGCAGGCACGAAA 
AACCTGGTCGCGCTGCATGGTGATCTCGCCCATGTGATGTGTTTGAACTGCGGTTTCGGG 
GAGGATCGACACCTCTTTGATGAACGTCTCGAAGCCGCCAACCCCGGCTACGTCGCTTCC 
ATTCGCCTGGAACCGGGCGCAGTCAACCCCGACGGCGACGTCTTCCTCGACGAAGAACAA 
GTACGCCGCTTCACCATGATCGGCTGCTTGCGCTGCGGCTCGCTCATGCTCAAACCAGAC 
GTGGTTTACTTCGGCGAACCCGTGCCCGCCGCGCGCAAAAAAGATTTAAAAAAGCTTCTC 
GACGCCTCCTCCAGCCTCTTAATCGCCGGCTCCTCCCTAGCCGTCATGAGTGGATACCGG 
ATCGTCATCGAAGCGCAACGTCAAGGAAAACAAGTGTCTGTCATCAACGGCGGCCCAGGT 
CGGGCGGATTCGCGCGTGGACATTTTGTGGCGCACCCGCGTTGCACCGGCCTTTGATGAC 
ATTTTGGACGCGCTGGACCTT 

>RXA02270-downstream
TAGACTTTTGGTGGCTTAAGTTC 

>RXA02306-upstream

ATCGGGCTGGGAATGTTCATGATCTTCGAGGGGGTTGTAGGAATCGGTGGCAGGGTAGTG 
GGTTAGCCCCGCCCCCAGGACGTCACTAGACTTAGGCACT 

>RXA02306

ATGCAACCTGAAGAAGTGCACATCAAGGACGAGACCATCAAGTTAGGTCAGTTCATCAAA 
TTGGCCAACCTTGTCGAATCAGGCGGAGCGGCCAAAGATGCCATCGCTAACGGTGATGTC 
ACCGTCAATGGTGAAGTGGATACCCGAAGGGGTAAGACACTTCGCGATGGCGATGTGGTG 
TGCATCGGCGAGGTATGCGCGCAGGTGTCTACTGGTGACGCAGCCGACGACGATTATTTT 
GACGAAGCCACCGCAAACGATGACTTCGATCCCGAAAAGTGGAGGAACATG 

>RXA02306-downstream
TAATGCCAGCCTTTGAGGCAATG 

>RXA02376-upstream

ACTGGTAGGATCGAAGAACCTTGGCGCCCACACATCCCCCTTTATTTTTCTACTTATACG 
TCGGCGCCCGTTTTATCCCGATCAGGAGTGACACCGTTTT 

>RXA02376

ATGAACCGATTTATTGACCGCGTTGTGCTACACCTCGCCGCAGGTGACGGCGGCAATGGT 
TGTGTCTCGGTGCACCGCGAAAAATTCAAGCCACTTGGTGGACCAGACGGCGGCAACGGC 
GGCCACGGTGGAGACATCATCTTGGAAGTCACCGCACAGGTCCACACCCTGCTTGACTTC 
CACTTCCACCCACACGTGAAGGCCGAGCGCGGCGCTAACGGCGCTGGCGATCATCGCAAC 
GGTGCCCGAGGCAAGGACCTTGTCTTGGAAGTTCCACCAGGAACTGTCGTGCTTAATGAA 
AAGGGCGAGACTCTGGCAGACCTGACCAGCGTGGGCATGAAGTTCATCGCTGCTGCTGGC 
GGTAACGGCGGTTTGGGTAACGCAGCGCTTGCCTCCAAGGCTCGTAAGGCCCCAGGCTTC 
GCCCTGATCGGTGAGCCAGGCGAGGCCCACGACTTGATTCTTGAACTCAAATCCATGGCA 
GATGTGGGATTGGTGGGCTTCCCATCAGCCGGCAAATCATCACTGATTTCTGTGATGTCT 
GCAGCAAAGCCAAAGATCGGTGATTACCCATTCACCACCCTGCAGCCAAACCTCGGCGTA
GTTAACGTTGGTCATGAGACTTTCACCATGGCAGACGTGCCTGGTTTGATCCCTGGTGCT 
TCTGAGGGCAAGGGCTTGGGTCTGGATTTCTTGCGCCACATTGAGCGCACCTCCGTGCTG 
GTTCACGTTGTCGATACCGCAACGATGGATCCAGGCCGCGATCCGATCTCTGATATTGAG 
GCTTTGGAAGCAGAACTTGCCGCCTACCAGTCGGCTTTGGATGAAGACACCGGACTTGGT 
GACTTGAGCCAGCGCCCTCGCCTTGTTGTGTTGAACAAGGCTGATGTCCCTGAGGCTGAA 
GAGCTTGCTGAGTTCCTCAAAGAAGATATTGAGAAGCAATTCGGATGGCCCGTGTTCATT 
ATCTCCGCAGTGGCACGCAAGGGCTTGGATCCTTTGAAGTACAAGCTGCTGGAAATCGTC 
CAGGATGCCCGAAAGAAGCGTCCAAAGGAGAAGGCTGAGTCTGTCATCATTAAGCCTAAG 
GCTGTTGATCACCGCACTAAGGGGCAGTTCCAGATCAAGCCTGACCCAGAGGTTCAGGGC 
GGATTCATCATCACCGGCGAAAAGCCAGAGCGCTGGATTTTGCAGACCGACTTTGAAAAC 
GACGAAGCAGTTGGCTACCTGGCTGACCGTCTGGCCAAGTTGGGCATTGAGGACGGGCTT 
CGTAAGGCAGGAGCACATGTGGGTGCAAACGTCACCATCGGAGGCATTTCCTTCGAGTGG
GAGCCAATGACCACCGCTGGCGACGATCCAGTCCTTACCGGACGTGGCACCGATGTGCGC 
CTTGAACAGACCTCTCGTATCTCTGCTGCAGAGCGTAAACGCGCATCTCAGGTACGTCGT 
GGCCTCATCGATGAGTTGGATTATGGCGAGGACCAAGAGGCTTCCCGCGAACGCTGGGAA 
GGA 

>RXA02376-downstream
TAAAACCGAGCACTTTTCAGGTC 



>RXA02392-upstream

AAAACTGCTGATGGTTTGGCAGTTCAGGGCTGTTTAACTGTTTGGGTTTGCTCAGGTAGA 
GTAGGACACCGTTATTTAATTGAAGGGACCCCTGACGCAT 

>RXA02392

ATGGCAGAGAAATTTGCAGAAACAACATTTACGGATCCAGCCAGGATTCGTAACTTCTGC 
ATCATTGCCCACATTGACCACGGTAAATCTACGCTCGCTGACCGTATCCTGCAGCTGTCT 
AACGTTGTGGATGCCCGCGATATGCGTGATCAGTACCTGGACAACATGGACATCGAACGT 
GAACGTGGCATTACCATTAAGGCTCAGAACGTTCGCCTGCCATGGATTCCTCGCAGTGGT 
GAGTACGAGGGCCAGCAGATCGTCATGCAGATGATCGATACGCCAGGCCACGTGGACTTC 
ACCTATGAAGTGTCTCGGGCGCTTGAAGCGTGTGAAGGCGCGATTTTGCTTGTTGATGCA 
GCGCAGGGCATTGAAGCCCAGACCTTGGCAAACTTGTATTTGGCTATGGAAAACGATCTT 
GAGATCATCCCTGTGCTGAACAAGATTGACCTTCCAGCGGCGGATCCAGACAAGTACGCG 
TTGGAGATCGCCAACATTGTGGGTTGTGAACCTGAAGATGTGTTGCGCGTGTCCGGTAAA 
ACTGGCATGGGTGTCCCTGAGCTTCTGGATAAGGTCGTTGAACTTATCCCAGCACCTACC 
TCTGAATTTGAGGAAGACGCCCCAGCTCGTGCGATGATTTTCGACTCTGTCTATGACACC 
TACCGCGGCGTGGTTACCTACATCCGCATGATGGACGGCAAGCTGACACCTCGCCAAAAG 
ATCAAGATGATGTCCACCGGCGCCACCCACGAGTTGCTGGAAATCGGCATCGTGAGCCCC 
ACCCCTAAAAAGTGTGTGGGTCTTGGACCTGGCGAGGTTGGTTACCTGATCACCGGTGTG 
AAGGACGTGCGCCAATCTAAGGTGGGCGATACCGTCACGTGGGCAATTCATGGAGCTGAG 
CAGCCACTGCGCGGTTACCAGGAACCAACACCGATGGTTTACTCGGGCTTGTTCCCGATT 
TCCCAAGCGGATTTCCCCGACTTGCGCGATGCGCTTGAAAAGCTGCAGCTTAACGACGCC 
TCCCTCACGTACGAACCCGAAACGTCCGTAGCACTGGGCTTTGGTTTCCGATGTGGCTTC 
CTCGGACTGCTGCACATGGAAATCACCCGTGACCGACTCGAACGTGAGTTTGGCCTTGAT 
CTGATTTCTACCGCGCCATCTGTTAACTACCGCGTTATTGATGAGGCGGGCAAGGAATTC 
CGCGTCCACAACCCATCTGACTGGCCTGGCGGAAAGCTCAGTGAAGTTTACGAGCCCATC 

>RXA02450-upstream

CCAGACCGATCTCCATCAATTCACAGGCGAAGCGAACCGAAAACGATTGCGTTGTTCTAC 
ACTGATCAGAGCCCGTCCCTCAACAAGAAGAGCAACACCA 

>RXA02450 

ATGAATCTGAAAGATCTCAAGGCCGCAGAGACCCGTCAAAGGTTTATCGATGTAGCCCAC 
GAACTCTTCTTGGAGCACGGTTATGGTTCCACCTCCATGAATCAGATTGCTCAGGCAGCG 
GGTGGTAGCCGGGCAAACCTTTACCTTCATTTCCGTAACAAGCCCGATCTCATGATGGCT 
AAAATGCGGGAACTTGAACCCGCGGTCCGCACCCCTGTCCTAAAAGTTTTTGATCTCCCT 
GAACACACTTTGGAGTCCATTCTTAGATGGCTGGACTCCATGACGGAGGTGTGGAAAGCG 
AATGCCAAAGTGTTCGGGGCGATGGAACAAGCGATGGTCGAAGATGCTGCGGTGGCCGAT 
GAGTGGCTTTCAATGATGCAGAGGTTGAGCCAATCGGTGCCCGAATTGGTTGAGAATGAA 
GAGCGTCGAGTTCAGTTCCTGGCTAGCTTGATGGGCATGGATAGAAACTTTTACTTCCTC
TATGTCCGAGGGCAAGATGTTGATGAGGAATTGCTAAAGTTGGCTGTGGCTCGCCAATGG 
TTGGCAGTTTTCCAA 

>RXA02450-downstream
TAGGCAATGCGCCCCAATCCCCT 

>RXA02493-upstream

GGTTCCGTAGTAAACCCAGGCGGCACCTACCTCGATCCTGAGGCAGCAGCAGCCGGCGCA 
GCAGCAGTAGCAAACCAGGGTAATAAGTAGCTATTTGTAG 

>RXA02493

GTGAGCACTCTTCTTGCTTTCGTATTGGGCGTGGTCCTCATGGGCCTCGCCCTACCTGCG 
TATACGAAAATTAAAGATCGGATGCGTCGCCACAAGTCCGCGGTCACCCTGTCCGAAAAC 
CAGGTCACCACGGTGGGGCAGGTCCTCCACCTGGCGATTCAAGGCTCCCCAACGGGAATC 
ACGGTTGTCGATCGCACCGGCGACGTCATCTTATCCAACGGCCGCGCCCACGAATTGGGC 
ATCGTCCACGAAAGATCCGTCGACGGCAACGTTTGGCGCGTCGCCCAGGAAGCCTTCCAA 
GACCAAGAAACCCACTCACTCGACGTCCACCCAGACCGCAATCCGCGGCGCCCGGGTAGT 
CGCATCACCGCAGTGCAGGCAGTGGTCAAGCCTTTAACGCTTATCGACGATCGTTTCGTG 
ATCATCTATGCCTCCGACGAATCCGAAAACGTGCGCATGGAATCGGCACGCCGAGACTTC 
GTCGCAAACGTCTCCCACGAACTGAAAACCCCCGTCGGCGGCATGGCACTCCTCGCGGAA 
GCCCTCATGGAATCCTCCGACGACCCAGAACAAGTCGAATACTTCGGATCCAGGCTCCAC 
CGCGAAGCCCACCGCATGGCCGACATGATCAACGAACTGATCTCCCTTTCCAAACTTCAG 
GGCGCCGAACGACTCCCTGATATGGAACCCGTCCAGGCTGACGACATCATCAGCGAAGCC 
ATCGAACGCACCCAACTCGCCGCCGACAACGCCAACATCGAAATCATTCGCGGCGACCGC 
ACCGGCGTTTGGGTAGAAGCCGATCGATCCCTGCTGGTCACAGCCCTGGCGAACCTGATC 
AGCAATGCAATCAACTACTCACCAAAATCAGTCCCCGTCTCCGTTTCACAAAGCATCCGA 
AACGACGTGGTCATGATCCGAGTAACCGACCGTGGCATTGGCATCGCACCCGAAGACCAA 
GGCCGAGTTTTCGAAAGATTCTTCCGCGTCGACAAAGCCCGCTCCCGCCAAACCGGCGGA 
ACTGGCCTTGGCCTCGCGATAGTCAAACATGTCATGGCTAACCATGGCGGTAGTATTAGT 
TTGTGGTCACGTCCTGGAACAGGCTCCACATTTACACTTGAACTCCCTGTATACCACCCA 
GAGTCCAAGGAACCGGCAGGATCTAAGCAGGGACCTAGTTTGGATTCACCTATTCGTACG 
ACTGCGTCCAAAGCATCTGGGCGCCGAAAGGAAAAATCA 

>RXA02493-downstream
TGACGAGAATCCTGATCGTTGAA 

>RXA02494-upstream

CAGAGTCCAAGGAACCGGCAGGATCTAAGCAGGGACCTAGTTTGGATTCACCTATTCGTA 
CGACTGCGTCCAAAGCATCTGGGCGCCGAAAGGAAAAATC 

>RXA02494

ATGACGAGAATCCTGATCGTTGAAGATGAGGAATCGTTAGCAGATCCTTTGGCCTTTCTT 
CTTCGCAAAGAAGGTTTTGACACCATCATCGCCGGTGATGGCCCAACCGCACTTGTGGAG 
TTCAGTCGCAACGAAATCGACATCGTCCTCTTAGACCTCATGCTCCCAGGCATGTCTGGC 
ACCGACGTATGCAAAGAACTCCGCAGCGTATCCACTGTTCCCGTCATCATGGTCACCGCC 
CGCGACTCCGAGATCGACAAAGTTGTTGGCCTCGAACTCGGCGCCGATGATTATGTAACC 
AAGCCATATTCTTCCCGCGAACTCATCGCCCGCATCCGCGCTGTCCTGCGCCGACGCGGA 
GTTACTGAAACCGAAGCCGAAGAATTACCACTTGACGATCAAATCCTCGAAGGCGGCCGC 
GTCCGCATGGACGTCGATTCCCACACCGTCACCGTCGGTGGCGAACCAGTGAGCATGCCA 
CTGAAGGAATTCGACCTTCTGGAGTACCTCCTCCGAAACGCCGGCCGAGTCCTCACCCGC 
GGACAGCTCATCGACCGAATTTGGGGCGCAGATTACGTCGGCGACACCAAAACCCTCGAC 
GTTCATGTCAAAAGGTTGCGTTCCAAGATCGAAGAAGAGCCATCTCGCCCTCGTTACCTC 
GTGACCGTGCGTGGATTGGGCTACAAATTCGAGCTG 

>RXA02494-downstream
TAGGGCTCTGTTAGGCCCTGTTG 

>RXA02631-upstream

TCGGATACGTCCTGCGAGAGACCGCTCCGTGACATTAAGGCGAATCGGCGCAGGGGAAAA 
TGGGCCTGCCCCTACCGAAAGTGATGACTCCGACGGTTCA 

>RXA02631

ATGTCGTTGCGTTGGCGCTTGGCTTTGCTGAGCGCCACTTTGGTAGCTTTCGCCGTTGGT 
GTTATTACTGTTGCTGCATATTGGTCTGTCTCCAGCTATGTCACCAACTCT^TCGATCGT 
GATCTGGAAAAACAAGCGGATGCAATGCTTGGACGAGCCAGTGAAGCGGGATTCTATGCA 
ACCGCAGAAACCGAAATTGCTCTGTTAGGTGAATATGCCAGTGACACTCGAATCGCCTTA 
ATCCCACCTGGGTGGGAATACGTCATCGGTGAATCCATATCACTGCCTGATTCAGATTTC 
CTTAAGAGTAAAGAAGCGGGGAAACAGATCCTCGTAACAAGTGCTGAGCGCATTCTCATG 
AAACGAGATAGCTCGGGCACAGTGGTGGTTTTTGCTAAAGATATGGTGGATACCGATCGG 
CAGCTCACGGTGCTTGGCGTCATTCTCTTGATCATTGGCGGCAGTGGTGTTTTGGCGTCG 
ATTCTGCTTGGTTTCATCATTGCGAAGGAGGGGCTGAAACCACTGTCAAAGCTGCAGCGT 
GCCGTCGAAGAGATCGAACGAACTGATGAGCTTCGTGCGATTCCCGTGGTGGGAAATGAT 
GAGTTCGCTAAGTTGACTCGTAGTTTCAATGACATGCTCAAGGCACTGCGGGAGTCTCGT 
ACCCGGCAATCTCAGTTGGTGGCAGATGCAGGACACGAGCTGAAAACTCCACTGACCTCA 
ATGCGGACAAATATTGAATTGCTGTTGATGGCAACCAACAGTGGAGGATCGGGAATCCCC 
AAGGAAGAATTGGATGGCCTTCAGCGTGATGTATTGGCGCAGATGACCGAAATGTCTGAT 
TTGATTGGTGATCTTGTTGATCTTGCGCGTGAAGAAACCGCCGAAACGTCAAGCATTGTA 
GATCTCAACCAAGTGTTGGAAATTGCGCTTGACCGAATGGAAAGCCGTCGCATGACGGTG 
CGGATAGATGTTTCCGAGACTGTGGATTGGAAACTGCTGGGCGATGATTTTTCCTTAACC 
AGGGCATTAGTAAATGTTTTGGATAATGCCATTAAATGGTCGCCTGAGAATGGCATTGTT 
CGAGTGTCGATGTCACAGATCGACAAAGCAACGGTCCGCATTGTTATTGATGATTCAGGG 
CCTGGAATTGCTGAAAAAGAACGAGGATTAGTTTTGGAACGGTTCTATCGCGCCGTCAGC 
TCCCGTTCCATGCCGGGATCGGGATTAGGTCTTGCCATCGTGAATCAGGTTGTGAATCGG 
CATGGTGGCCAACTCGTTGTGGGTGAATCAGATGATGGCGGAACGAGAATCACTATTGAT 
TTGCCAGGGGAACCCATTCGCAGCGGGTTCGAAAATGTCGATGAT 

>RXA02631-downstream

TAAACCACTAAAGAGCTCACAGG

>RXA02632-upstream

TATGCTTAAGAGGTGTTAGCATAAGTGAAATATGTTCCAACGCGTGGACGTCTTAATTGG 
GAGGAAGTCTGTCACGGACTGGAAGACGAAAAGGGTATCG 

>RXA02632

ATGAAAATTTTAGTTGTTGATGACGAGCAAGCTGTACGTGACTCCTTGCGACGTTCCCTT 
TCGTTCAACGGATACAACGTTGTTCTCGCAGAAGACGGCATCCAAGCACTAGAGATGATT 
GACAAGGAACAGCCTGCTTTGGTGATCCTCGATGTCATGATGCCTGGTATGGACGGACTT 
GAGGTCTGTCGCCACCTTCGCAGCGAAGGCGATGATCGGCCAATTCTTATTCTTACTGCC 
CGCGATAATGTTTCTGATCGTGTTGGTGGCCTCGATGCAGGCGCAGATGACTATTTGGCT 
AAACCATTTGCTCTTGAAGAGCTGTTGGCGCGCGTCCGTTCACTGGTGCGTCGCTCTGCA 
GTGGAATCAAATCAGAGTTCCAGCATTGAACAGGCTCTATTATCTTGTGGCGATTTGACG 
CTTGACCCAGAAAGTCGAGATGTCTACCGCAACGGACGCGCCATCAGCCTTACTCGAACA 
GAGTTCGCGCTCCTGCAATTGCTCCTCAAAAACCAAAGGAAAGTGCTCACTCGCGCCCAG 
ATTTTGGAAGAGGTATGGGGCTGCGATTTCCCCACTTCAGGCAATGCCCTCGAGGTCTAC 
ATTGGATACCTTCGACGCAAGACTGAATTGGAAGGAGAAGACCGCCTGATCCATACAGTA 
CGAGGAGTCGGATACGTCCTGCGAGAGACCGCTCCG 

>RXA02632-downstream
TGACATTAAGGCGAATCGGCGCA 

>RXA02667-upstream

GCGCAGCAGGGTGAAGTATTAAATAGGTCGCTTGCATGGACAATGCGGGCGCCACCGGCT 
GGTTCTAGCAAAAATGGAACCGCCATTAGAAGGAGTGGGA 

>RXA02667

ATGGAATTCAAGGTCGGAGATACCGTCGTTTACCCGCACCACGGAGCTGCAATTATTTCA 
GCCCTGGAGCAGCGTGAAATGAATGGTGAGACGGTGGACTACCTGGTTCTCCAGATCAAT 
CATTCCGATCTCGTCGTTCGCGTTCCAGCAAAGAACGCTGAACTCGTTGGCGTGCGTGAC 
GTTGTCGGCGAGGAGGGCCTGCAGAAGGTTTTCTCTGTTCTTCGTGAAATTGACGTCGAA 
GAAGCCGGCAACTGGTCCCGCCGTTACAAGGCTAACCAGGAGCGTTTGGCTTCCGGTGAC 
GTGAACAAGGTCGCTGAGGTTGTCCGTGACCTGTGGCGTCGTGATCAGGATCGTGGCCTT 
TCCGCTGGTGAGAAGCGCATGCTCTCCAAGGCGCGTCAGGTTCTTGTTGGTGAGCTCGCG
CTCGCCGAAACCGTGGACGATGAGAAGGCGGATGCTTTCCTCAGCCAGGTCGATGAGACC 
ATTGCTCGCCACCGCGCTGACCTGCTCGGCGACGAGGAAGAGAAGAAGGACGCATTCGAC 
GACTTCGACGATTCCGACGTGGATCTTGACGATCTGAGCTTCGACGACGAAGAT 

>RXA02667-downstream
TAGACGCCCATGTCGTCTACACG 

>RXA02668-upstream

CGGCCCTCCACCCGCAATCTACGGTTGAGCGGCCGAGGCCAAGGCAGCCGTTTCTGCTCC 
TGTTGTTAATCAGGTW^CAGAAAAGGCCATACTTGAAACC 

>RXA02668

ATGACGAACCCCTCCCCCGCGCTAAATGAAACCCTTTCCGGCAGGGTGCTGATCGTTGAA 
GATGAGCGCCCTCTTGCTCGCATGATTTCGCTTTATTTAAGCAAAGCGGGTTTCGATACC 
ACCACGATCCACGACGGCGCCGCTGCTCCAGATAAGGTCGCTCACCTGCGCCCCGACGTG 
GTCATTTTGGATCTTGGGCTGCCTGGTCTTGATGGTTTGGAAGTGTGCAAACGCATCCGC 
GCGTTCACCGATTGCTACATCCTAATGCTCACCGCCAGGGGTTCAGAGCGGGATCGGATT 
ACAGGTTTGGAAATTGGGGCTGATGATTACATCACCAAGCCGTTTAATATCCGCGAACTT 
GTCATTCGTATCCAGTCAGTAATGCGTCGCCCTCGAAAAATCGATGAAACCATCCAAAAT 
GGTTTGACCTTGACTTATGGCCACATTGAGCTGGACACCTTGGCGCATGAAGTCACTGTC 
AAAGGCGTTGGGGTGACACTGACCCGCACAGAATTTGAGCTGCTTCAAGCCCTCATGCAC 
AAACCGGGAGAGGCAGTGTCTAGGCGTGATTTGGTCAGCCAAGTGTGGGATACCACCTGG 
GTTGGCGATGAACGCATCGTTGACGTGCACATTGGAAATCTGCGCCGCAAGCTGGAAGCA 
CCTGCGCCGGGTTCACACTTCATCGACACCATCCGAGGTGTTGGCTACCGGATGGCCTTC 
AAA 

>RXA02668-downstream
TGACAGCCCTCATCCCAGCTCGC 

>RXA02669-upstream

ACGTGCACATTGGAAATCTGCGCCGCAAGCTGGAAGCACCTGCGCCGGGTTCACACTTCA 
TCGACACCATCCGAGGTGTTGGCTACCGGATGGCCTTCAA 

>RXA02669

ATGACAGCCCTCATCCCAGCTCGCCACAGCCTGACTTTTCGTCTGCTCACCGCGCAGCTT 
GCTGTGGTGTTGATCAGTCTGCTGGCCGCCCTGATTGTGGCTGCCTTGGTAGGGCCTGCA 
ATTTTCAATTCTCACCTGGATCTTTCCGGCCCGATTGATCCCCGCCAGACGGATTTCCAC 
ATTCAGGAGGCCTACCGGGACGCCAATTACATTGCCCTCGCAGCGGCACTTCCCACCGCA 
GTGTTGAGCTCCATTGGTGTGAGTTTTTGGCTTTCCCACCGCCTGGGCCAGCCGTTGTGG 
CGACTGTCCCGGGCTGCAACTGCCATGAGCTCCGGCGACTACCAGGTGCGCGTACCCATT 
TCCGATGTGGATAAAGAGGTCGCTGCTCTATCTCTCGCCTTCAATTCCATGGCGGATCAG 
CTCGAACACACAGAAGAACTCCGCCGAAACATGCTCTCCGATCTATCCCATGAAATGAAC 
ACTCCCCTTTCCGTCCTCCTTGTTTATGTCGACGGTTTGCAGGACGGCATGGTGGAGTGG 
GACGCCGACACCCACGCAGTTTTCGCCGAGCAACTTGGCCGGCTTTCCCGCCTCACATCA 
GATCTTGATGATGTCTCTAGAGCCCAAGAACACCGCTTCGACCTGGTCTACAGCACCGTC 
GCCATCGGTGGTCTCATTCACAATGCCGCCGGAGCCGCCGCAGGTTCCTACCAAGAAAAA 
GGCGTGGCCCTGGAAGTAACAGGCAGCGATTCCACCGAACTCATCCGCGTTGATAGCCAA 
CGCTTCGCCCAAGTCATGGCCAACCTCTTCTCCAACGCCTTGCGGCACACCCCCGCCGGT 
GGGAAAGTTCACGTCCGCGTCCTGCGTCAAGGCGTGGGAACCATCGTCATCGAAGTCATA 
GACAACGGCGAAGGAATCGCCCCTGAACACGTAAAATACGTTTTCGAACGCTACTTCCGC 
GCCAAACGATCCGACTCCGACGACCAATCCGGCTCCGGAATCGGCCTCACCATCTCCCGC 
GCACTCATCGAAGCGCAAGGTGGCACACTAACCGCAGAATCCGCTGGCCTGGGCAAAGGC 
GCGAAATTTACCATCCGACTACCCCTTTTAAGCAAA 

>RXA02669-downstream
TAAAAAAATTGCTTTTCGACGCT 

>RXA02698-upstream

CATCCTGTTTTCAAAGGTC/^GAAGGTGCTTTCATTCCCGGCACGCGCAGAAATACCCTA 
AAGATTCTCCATTAGAGCTCGAACCAGCTAAATTAAGACT 

>RXA02698

GTGAGTTCCAACAATGAATCTTCCTTCGCCCTGCCCGACAATGAACCATTGCTGACCCTT 
CCGGAGACAGCCGAGCGCCTCGGCGTTGTTGTCACCAAGGTGATGGATCTGGTCAATGAA 
CACAAATTGATCGTGGTCCGGCGCGACGGTATTCGCTACATTCCAGAAGCTTTCCTGAGC 
ACCAAGAAGGAAAACACCAACCGTTTCATCCCTGGAGTTATTGCCTTGCTTGCCGACGGT 
GGCTTCAGCGACGAGGAAATCCTCGCGTTCCTGTTTACCGAAGACGAGACCCTTCCTGGT 
CGCCCCATCGATGCACTTCATGGCCAGTTGGCTCGTGT^GTTATGCGACGCGCTCAAGCA 
ATGGCGTTC 

>RXA02698-downstream
TAAGCGCTTTCTAAAAGATCTAA 

>RXA02699-upstream

TTGTGTACCTTCCGACATACTGGAACGCATGGCAAACTTGAAGGTCGGTGACGTTTTAGA 
GGACAGGTATCGGATTGAAACTCCGATTGCCCGGGGTGGT 

>RXA02699

ATGTCTACCGTGTACAGGTGCCTTGATCTTCGTTTAGGACGTTCCATGGCGCTTAAAGTC 
ATGGAAGAAGATTTCGTTGATGATCCCATTTTCCGGCAGCGTTCCCGTAGGGAAGCTCGG 
TCAATGGCGCAGCTAAATCATCCAAATTTGGTCAATGTGTATGATTTTTCCGCTACTGAC 
GGTTTGGTGTATCTGGTGATGGAGTTAATCACTGGTGGCACCTTGCGTGAGTTGCTGGCT 
GAGCGGGGACCTATGCCCCCGCATGCTGCTGTGGGCGTTATGCGTGGGGTGCTCACGGGT 
CTCGCGGCTGCCCACCGGGCGGGCATGGTGCACCGGGATATCAAGCCTGACAACGTGTTG 
ATCAATAGTGATCACCAGGTGAAACTGTCTGATTTCGGCTTGGTTCGAGCGGCTCACGCC 
GGCCAGTCTCAGGACAATCAGATTGTGGGCACGGTGGCTTATCTTTCCCCTGAGCAGGTT 
GAGGGCGGTGAGATCGGGCCGGCCAGCGACGTGTATTCGGCAGGCATTGTGCTCTTTGAG 
CTGCTCACAGGCACCACGCCTTTTTCGGGCGAGGATGATCTCGACCATGCATACGCCCGC 
CTTACGGAAGTCGTGCCGGCACCGAGTTCGCTTATCGACGGCGTCCCCTCCCTCATCGAT 
GAGCTTGTCGCGACAGCTACCTCCATTAATCCTGAGGATCGTTTCGATGATTCTGGAGAG 
TTTTTGTCCGCACTGGAAGATGTCGCAACAGAGTTGAGCTTGCCGGCTTTCCGGGTCCCT 
GTGCCGGTTAATTCCGCAGCCAATAGGGCTAATGCCCAGGTCCCGGATGCTCAGCCAACT 
GATATGTTTACCACCCATATCCCCAAGACTCCTGAGCCTGATCACACTGCGATCATTCCG 
GTGGCCTCAGCAAATGAGACGTCGATTCTGCCTGCGCAAAACATGGCACAAAATATGGCG 
CAGAATCCGCTGCAACCTCCGGAACCTGATTTCGCCCCGGAGCCACCTCCGGACACAGCG 
CTGAATATTCAAGATCAAGAGCTTGCGCGCGCCGATGAGCCAGAAATTAATACCGTCAGC 
AATCGTTCCAAATTGAAGCTGACGTTGTGGTCAATTTTCGTGGTCGCAGTGATCGCTGCT 
GTTGCTGTTGGCGGTTGGTGGTTCGGTTCAGGCCGTTACGGTGAGATTCCGCAGGTGTTG 
GGCATGGATGAGGTCCAGGCAGTAGCTGTTGTAGAGGAAGCTGGTTTCGTGGCAGTGGCT 
GAACCTCAGTATGACAATGAGGTTCCCACTGGTTCGATTATTGGGACTGAACCTTCTTTT 
GGTGAGCGCCTTCCTCGCGGCGAGGATGTTTCTGTCCTCGTCTCTCAAGGGCGTCCCGTG 
GTGCCGGATCTTAGCGAGGATCGATCCTTAAGCACCGTTCGTGAAGAGTTGGAACAGCGC 
ACGTTCGTCTGGGTTGATGGCCCAGGTGAATATTCTGACGATGTTCCAGAAGGACAAGTA 
GTTTCTTTTACACCGTCGTCAGGCACGCAGCTTGATGTTGGTGAAACCGTGCAGATCCAT 
TTGAGCCGAGGCCCCGCCCCGGTTGAGATTCCTGATGTCTCTGGCATGGGAGTGGATCAG 
GCAACACGTGTGTTGGAGCGCGCAGGTTTGAGCGTCGAGCGTACTGAAGAAGGCTTTGAT 
GCTGAGACACCAAATGGTGATGTCTACGGGACTTCGCCCAAGGTATCTACTGAGGTCAAG 
CGCGGAACCTCTGTTGTGCTGCAGGTGTCCAATGCTATTTCGGTACCGGATGTGGTGGGT 
ATGACCAAGGACGAAGCCACCGCGGCGCTTGCGGAAGAAGGATTGGTCGTGGCGTCGACA 
AGCATTATTCCTGGTGAGGCGGCGAGCTCCGCTGACGCCGTCGTGACCGTCGAGCCTGAA 
TCCGGCAGCCGCGTTGATCCAGCGCATCCGCAGGTCAGCCTCGGGTTAGCTGGGGAGATT 
CAAGTTCCAAGCGTGGTTGGACGTAAGGTTAGCGATGCTCGAAGCATTCTGGAAGAAGCC 
GGTTTAACGCTGACAACTGATGCGGACGACAACGATCGAATTTATAGTCAAACCCCTCGT 
GCACGCAGCGAAGTCTCGGTAGGGGGAGAAGTTACAGTAAGGGCGTTT 

>RXA02699-downstream
TAGTGGTTCCCTCGTTGCAGCAA 

>RXA02724-upstream

CGAGCGCGCACGCGGTGGCCACCGCCAAGCTCATTGCGGATCCGGACTCCGATTTCATGG 
CTGATCTGGAGGAAGCGCGGCGCGTGGATAATATGGGTGC 

>RXA02724

ATGCTGATCGGTGAGGTGTCCAAGCTCAGTGGGGTGAGTGCGCGCATGTTGCGGCACTAC 
GAAAAGCTGGGTTTGGTCGAGCCGAAGCAGTCGACGGCGGGGTATCGGGAATACTCAGAA 
GGCGATGTGCGCAGGATTTTCCATATCGAAGGTTTGCGCAGCTTGGGTCTTAGTTTGAAG 
CAGGTTGGAGACGCGCTTGAGGATCCAGACTTTGATCCTCAGGCAGTCATTTCCGAGATG 
ATTGCTGAGACTTCTGCGCGGATTTCTATGGAACGGGAGTTGCTTGCCAGGCTGAAAGCG 
GTGCGTCATGCGCAGGCCTCGGATTGGGAATCGGCGTTGGATGCGGTGCAGATTTTACGT 
CGCCTGCGATCGGGGGATCCGGCGCAACGTCAAGCCGTGGCCTATGACTCTGTCTCTGGT 
AAAGAAGCAGTTGCGCTAGAAACCTTGGTGGAATCGGCGCTCGGTGAGTCGCATTTGAAC 
GCGGAGGGGGCGCTGTCGTGGGCGGTTGTGCAGCGTGGTGAGGAAGCTGTTGCATTGGCG 
GCACGAGGTTTGCGCTCAAGGGATGCGGCGGTGCGGCTGCGGGCTGTAAGGATTGTGGCG 
AGCGCGCCGAGTGCTGTTGCGGATCGAGTAGAGTGGCTACGGCCAATGATTCGCGATCCC 
GATGCTCTCGTGCGTGCTGAAACTGCGTTGGCGTTGGGAAAATCAGGCGATGAGAGTGCA 
GTTGAGCAGCTCGTGTCCATGGTTCTCACCGGTCTTCGGGACGTGGAGGCAGCCGAATTG 
CTTGCCGGATTTGGGGAGCCCGTGCAGTTAGATGTGTTCAAGAAATTTGCGCGGACGCTG 
GATGATGAGGAAACAATGTCCCCGACG 

>RXA02747-upstream

CACCGGCAAAGTCGGCGACGGCAAAGTGTGGATGACTAACATCGAAGAGCTGGTTCGTGT 
TCGTACCGGTGAGCGCGGCGAAGCAGCCCTTTAAAAACTT 

>RXA02747

ATGAATAATCCAGCCCAGCTGCGCCAAGATACTGAAAAGGAAGTCCTGGCGTTGCTGGGC 
TCTTTGGTTTTACCCGCCGGCACCGCGCTTGCCGCCACCGGATCTTTGGCCAGGTCCGAA 
CTCACGCCGTATTCCGATTTGGACCTCATTTTGATCCATCCACCAGGAGCCACCCCGGAT 
GGCGTGGAGGATTTGTGGTACCCGATTTGGGACGCAAAAAAGCGTCTCGACTACTCCGTG 
CGCACCCCAGATGAGTGTGTGGCTATGATTTCTGCGGATTCCACTGCAGCCCTTGCCATG 
CTTGACCTGCGGTTTGTCGCTGGCGATGAGGATCTGTGTGCCAAAACGCGCCGGCGCATC 
GTGGAGAAGTGGCGCCAGGAACTCAACAAAAACTTCGATGCCGTTGTGGACACCGCGATT 
GCCCGTTGGCGCCGCTCCGGACCCGTCGTGGCAATGACGCGGCCAGATCTTAAACACGGC 
AGGGGAGGGCTGCGCGATTTCGAACTGATCAAGGCCCTCGCGCTCGGCCACCTATGCAAC 
CTTCCACAGCTTGATGCGCAACACCAGCTGCTTCTCGACGCCCGCACCTTGCTGCACGTC 
CACGCGCGACGCTCCCGCGACGTCCTTGACCCCGAATTTGCGGTGGATGTGGCCATGGAT 
TTGGGCTTTGTTGACCGCTATCACCTGGGCCGGGAGATCGCCGATGCAGCCCGCGCCATT 
GATGATGGCCTGACCACCGCGCTGGCCACCGCCCGTGGCATTTTGCCACGTCGCACAGGT 
TTTGCATTCAGGAATGCTTCTCGACGCCCACTTGATCTTGATGTCGTCGACGCCAACGGC 
ACCATCGAATTGTCCAAAAAACCAGATCTTAATGATCCCGCACTTCCACTTCGAGTGGCC 
GCAGCCGCAGCAACCACCGGACTTCCGGTGGCAGAATCAACCTGGGTTCGACTTAATGAA 
TGCCCGCCACTTCCTGAGCCATGGCCTGCCAATGCAGCAGGGGACTTCTTTCGGATTCTC 
TCCAGTCCGAAAAACTCACGCCGAGTGGTGAAAAATATGGATCGCCACGGATTGTGGTCG 
CGTTTTGTTCCAGAATGGGACCGCATCAAAGGGCTTATGCCCCGTGAACCCAGCCATATT 
TCCACCATCGATGAACATAGTCTGAACACTGTTGCAGGATGTGCGCTAGAAACTGTGACC 
GTCGCGCGCCCCGATCTTTTAGTTTTGGGAGCCTTGTACCACGACATTGGCAAGGGCTTC 
CCGCGTCCACACGAACAAGTAGGTGCAGAGATGGTGGCGAGGGCTGCAAGCCGCATGGGA 
TTGAACCTTCGCGATCGTGCCAGCGTGCAAACGCTGGTCGCCGAGCACACCGCGGTGGCC 
AAAATCGCCGCGCGCCTTGATCCCTCCTCGGAGGGCGCCGTCGATAAGCTGCTTGATGCT 
GTTAGGTATGACCTGGTGACATTGAATCTGCTTGAGGTGCTAACAGAAGCTGATGCGAAA 
GCCACGGGGCCTGGCGTGTGGACGGCGCGTTTGGAGCATGCGCTGCGGATTGTGTGCAAG 
CGTGCGCGTGATCGCCTCACCGATATTCGCCCGGTTGCGCCGATGATTGCGCCACGTAGT 
GAAATTGGTTTGGTGGAACGCGATGGCGTGTTCACAGTGCAATGGCACGGCGAAGACTTA 
CATCGGATTCTTGGCGTAATTTATGCCAAAGGATGGACAATCACCGCGGCGCGCATGCTG 
GCCAATGGTCAATGGAGTGCGGAATTTGATGTCCGCGCAAACGGCCCCCAAGATTTTGAT 
CCGCAGCATTTCCTGCAGGCATATCAATCCGGTGTGTTTTCCGAGGTTCCCATTCCAGCA 
CTTGGGATAACAGCCACATTTTGGCACGGGAACACTTTAGAAGTGCGCACTGAGCTTCGC 
ACAGGAGCTATTTTTGCCCTGCTCAGAACATTGCCCGATGCCCTCTGGATCAACGCTGTG 
ACCCGCGGTGCGACCCTGATTATCCAGGCAGCACTGAAGCCCGGCTTCGATCGAGCAACG 
GTGGAACGCTCCGTAGTCAGGTCGTTGGCAGGTAGC 

>RXA02747-downstream
TGACGTGACCTGAGCGGGGGCAA 

>RXA02760-upstream

TAGCATGGGACGAAAGCTGTTAGATAGCATGTTGCATCCCTGCGTTGGCTGATTATCGCT 
GGGTTTTAGGGTCGATAGATAGGTTGGGAGAACACGCATT 

>RXA02760

ATGAGCGATGAGAACATTAACGAGTTTGAGCAGGACGAGGATCTGAACTTCGGCGCGAGC 
TTTAGTGATGAATTCGCAGATGACGATTTCGATGCAGAAGCAGACGTAGAAGCAGATGCT 
GCTGCAGAGGCCTCTGCCCTGGAAGCTGAGCAGGATCTGGAAGAAGAGACCCTAGATGCT 
CCAGAAGAAGCCGCAGAAGAAGCTCCTGCTGCTGCAGAGTCCGAAGCTCCAGTAGAAGAG 
GACGAAGAGGCTGACAGCCTTGCTCAGGCGGCTGCTGCACTTGGTGACACCGATGAGCAG 
GACGCGGATGCAGAGTACAAGGCTCGTCTGCGTAAGTTCACTCGTGAGCTGAAGAAGCAG 
CCTGGTGTTTGGTACATCATTCAGTGCTACTCCGGCTACGAGAACAAGGTGAAGGCGAAC 
CTTGACATGCGTGCTCAGACCCTTGAGGTTGAGGATGACATCTTTGAGGTTGTTGTTCCT 
ATCGAGCAGGTCACTGAGATCCGTGATGGTAAGCGCAAGCTGGTTAAGCGTAAGTTGCTG 
CCGGGCTACGTTTTGGTCCGCATGGACATGAATGACCGCGTGTGGTCTGTTGTTCGCGAT 
ACACCTGGTGTGACCAGCTTTGTGGGTAACGAGGGCAATGCAACTCCTGTGAAGCACCGC 
GATGTTGCGAAGTTCTTGATGCCTCAGGAGCAGGCTGTTGTCACCGGTGAGGCTGCTGCT 
GCGGCTGCCGAGGGTGAGCAGGTTGTGGCTATGCCTACCGATACCAAGAAGCCTCAGGTT 
GCTGTGGACTTCACTGTTGGTGAGGCTGTGACCATTCTGACTGGTGCTTTCGCTTCTGTT 
TCTGCAACGATTTCTTCTATCGATCCTGAGCTGCAGAAGCTGGAAGTTTTGGTGTCCATC 
TTTGGTCGTGAAACTCCTGTTGATCTCAGCTTCGACCAGGTTGAGAAGGTTAGC 

>RXA02760-downstream
TAGTAGCTAAACTGCACCACTTA 

>RXA02763-upstream

CGTGTTTAAATCTAGAAGTTTAAAGGGTGAAAACAGTCCATTACTTAAGCACCAATCTGC 
CATAATTTTTACCCCAACGCATAGGCTTAACGGTGTGAAT 

>RXA02763

GTGAAGTTAACTGACGCCGCCCGTGAAGCTGGAGTAGGTTACGGTACTGCTTCTCGCGCA 
ATTTCTGGACGAGGTTCCGTTGATGCAGCAACCCGTGACAAAGTACTCGCCGCCGCCGAG 
AAACTTGGGTACCGAACCAACGCCATGGCTCGTGCACTTAGGGAAAACAAGACCCGCACC 
GTTGGCCTGATCGTTCCCGGCATTATCAATAAGTTCTACACCGAATCCGCCACTGTCCTC 
CAAGATGAATTAGACAAATCCGGATACCAACTAGTTGTTTCCACAACTGGAAACGACGCA 
GAAAAGGAACGTCGAGCTATCGAATCCATGCTCAACCGCCAGGTAGATGCAGTGGTGCAC 
GCTCCAGTTAATCCCCAAGCGAAGTTTCCAAAGGGCTTCAAAGTGGTCGAGCTTAATCGT 
CGTAGCGATCTCAACCGACCTACTGTGACCAGCGATGATGCCACTGGTTTGAAGGAACTT 
GCTCTTCATATTTTGGATCAGGGATACCGAGATATAGGTATCATTGTCGGTCCTGCTGAG 
CTCAGCACCGCCCGAGACCGCAAAGCCGGATTCATCAACGCCCTCGAAACCGAAGCCACA 
CAACGCGGAATCCGCGAAGAACTACGATTCCGGGTAGTTCACTCCCGCTACTCCCCCACC 
GGCGGTTATGAAGCATTCGCAG7VATTCCGCAATGATCTCCCTCAAATCGTGGTGCCCCTG 
AGCACGCAATTAACTCTAGGAGTTCTCAAAGCAACCCAAGAAAACGGCATAAAAATATCG 
GATGACCTGTCACTTGCTTGTTACGGCGTCGCCGAATGGCTCGCAGTGTGGGGCCCTGGC 
ATCACCGTTTTCGCACCAGACCTCCCAGCCATGGGCGCCGCAGCTGCCACGCAGGTTTTA 
ACGCTTCTCGACGCCGCCCCACTCCCCGAAAACCACTTAAGCATTCCGGGGCAGCTCATT 
GTGCGTGGGACAACTCCAAAGGTT 

>RXA02763-downstream
TAAAGGTAGAGGCGCACAATAAT 

>RXA02787-upstream

CACGATTCGCAGACGACGTGGTGCGTTTGTTCCTCAAAAATGACTAGCTGAAGAGCACCG 
GTTGCTTGGTTGTTTCAGCCAAGCGACTAACCTGTGCACT 

>RXA02787 

ATGGCCCAGGATTCACTTTTTGAAACGCCCGAAACACCGGGATCCGCAGGCAACACAAGC 
AGCGTGAGCAATTCCAAAGCCGCCTCGAAGTATTTTCACCCAGGCGGACACGCACCCCTT 
GCTGCCCGCATGAGGCCAAGGACGCTTGATGAAGTGGTTGGCCAACAGCATTTGCTGGGG 
GAGGGCAGGCCACTTCGCCGGCTCATTGAAGGTTCAGGGGATGCCTCCGTCATTTTGTAT 
GGGCCTCCCGGCACTGGAAAAACAACCATTGCCTCATTGATTTCTGCAGCTGCAGGCGAT
CGCTTTGTGGCGATGTCGGCGCTGTCCTCAGGTGTGAAAGAAGTCCGCGCCGTTATTGAA 
CGTGCGAGGATGGATCTGCAATTAGGGCAGCGCACCGTGCTGTTTATTGATGAGGTTCAT 
AGGTTTTCCAAAACTCAGCAGGACGCGTTGCTCTCTGCAGTGGAAAACCGCACCGTGTTG 
CTCGTTGCAGCGACCACTGAGAACCCCTCCTTTTCAGTGGTGTCTCCACTGCTGTCCAGG 
TCCTTGCTGCTCCAGTTGGAATCTTTAAGCGATGAGGATATTAAAACAGTCCTTAATAAA 
GCGCTTGAAGATGAGCGTGGACTTGCCGGTCGAATCACCGCCACCGATGAAGCAGTTGAC 
CAGTTGGTTCTTCTTGCCGGTGGCGATGCCCGCCGAGGCCTGACCTACATTGAAGCCGCT 
GCAGAAGCCGTAGAAGATGGCGGCGTTTTAGATATTGACACCGTCATGGCCAACGTGAAC 
CGCGCAGTGGTCCGCTATGACCGCGATGGCGATCAGCACTATGACGTGGTCAGTGCCTGG 
ATCAAATCAATTAGAGGCTCCGATGTAGACGCAGCCTTGCACTACTTGGCGCGCATGATT 
GATGCCGGTGAAGACCCACGGTTTATTGCCCGCCGGTTGGTGGTTCACTCAAGTGAAGAC 
ATCGGTATGGCTGATCCTTCGGCCATGCAAGTGGCCATTGCTGCAGCTCAAGCTGTCCAA 
TTAATCGGTATGCCAGAGGCGCGGATCAATTTGGCGCAAGCGACCATTCATTTGGCTCTT 
GCTCCC/^AATCCAATGCTGTCATCATGGCCATGGATGCTGCTTTGACTGATGTTCAGCAA 
GGCCACATCGGTACCGTTCCTGCGCATCTTCGCGATGGTCACTATGAAGGCGCCAAAAAG 
CTCGGAAATGCAGTGGGATATTCCTATCCTCACGATGATCCCAGGGGAGTGGTCCGGCAA 
GAATATTTACCGGAGAACCTGCGCGATCGGGTCTATTACGAGCCCACCACACACGGTGGA 
GAGAAGCGGATTGCCGAGTACATTGGCAGGCTTCGTCGTATAATCCGTGGAACCAAG 

>RXA02787-downstream
TAGCCCGGGTTGCTCAACACCTA 

>RXA02830

CTGGAAGACAGCCTTGGCGTGTCGCTGTTTGAACGGGCCGGGCGCGGGCTGGCGCTGACA 
GGGGCGGGCGATCAGCTTTTGTCGCAGGCGCGCCGCCTGATCGCCCTGAACGACGAGGTA 
TACGCCCGCTTGAACGCCGGTGCCTACGAGGGCGAGGTGACGCTGGGCGTGCCTCAAGAC 
GTGATCTACCCCGTCATCCCGCGCGTCTTGCAGCAATTCGCCCGCGATTTTCCCCGCGTG 
CAAATTCACCTGATCTCGAACTTCACGCTGATGCTGAAAGAACAGTTCCGCCGCGGCGAA 
ATCGACGTGATGCTGACGACCGAGGACGAGCTGGGCGAGGGCGGCGAGACGCTGGCCCAG 
CGCGAGCTGATCTGGGTCGGCGCACCGGGCGGGTCGGCGTGGACCCGCAGGCCGCTGCCC 
TTGGCGTTTGAACGCGCCTGCATTTTCCGGTCTTTCTTGCAGCGCCGCTTGGATGCCAAC 
AGCATCTATTGGCAA 



>RXA02831-upstream

CGAACTGGCGCGTGTCTTGTCCGACGCAGCCTAGCCCGCCTTTATTGACCCCTCCGGCAC 
GCTTCGATAGGGCTAGGAAAACCCTGTCGGAGGAGCGCCC 

>RXA02831

ATGACACATCGGATCACACCCGAACTCTCGGCCGAATTGCGGGGGGTGGCCCACAGCCTT 
GCAGATGCGGCGCGGCCCGTCACCTTGCAATACTTCCGCACAGCAGTCGCGGCAGATAAC 
AAAGGCGCGCTGCGCGGGATGGCTTACGACCCCGTCACCATTGCCGACCGTGCAAGCGAA 
CAGGCCATGCGTGACATTCTGGCCCGTCTACGCCCCGATGATGCGATCTTGGGTGAAGAA 
TTCGGCCCCAAAGCGGGCACAACGGGCCTCACATGGGTGCTGGACCCGATTGACGGCACT 
CGCGCATATATCGCGGGCGCGCCCACTTGGGGCGTGCTGATCGCAGTATCGGATGATCAG 
GGCCCGCTGTTCGGTATCGTCGACCAACCCTATATTGGCGAGCGTTTT 

>RXA02880-upstream

GTCGCAAAAGTCGCTGGCGTCTCCCCTTCCACTGTGTCGCGGGCGTTTTCGCAGCCTGGG 
CGAGTGAGTTTTTCCACTGCGGAGAAAATCCGCAACGCGT 

>RXA02880 

GTGGAAACTCAGGCTTTTCAGCGCCAAAACACCGGCCTCATCGCTATGGTTGCCGCCGAT 
GCGTCGAATCCCTTCTTCTTGGAAATTTTCCGGGGCGCGCAGCACGCCGCAAGCACTCAG 
GGCTATACGGTTGCGCTTGTCGACGCCCGGGAGTCGGCGATTAAGTCCAGGGAGGTGCTG 
GACAAGATCGTCCCCCACGCCGATGGCTTATTGCTCGCTGCTTCAAGGATGGATTCTGGT 
GAGATCCACAAAGTCGCGCGGGAAATTCCCACTGTATTAATGAGCCGTGAAGTGCAAGGT 
ATTCCCAGCGTGATGGTGGATAACTACGACGGTGCGCCGAAGGCTGTGGTGCATTTGGTG 
GATCAGGGGTGCCGCTCCATTACCTATATCGCCGGTCCTAATAAATCCTGGGCT 

>RXN00031-upstream

CACACATCGCCTCGTCATCCTTAGACACGCCAAATCTTCCTGGTCCACCGGAGTACTCGA 
CCATAAACGCCCACTTAATCAACGTGGGCTTCGCGATGGC 

>RXN00031 

GTGGCAGCTGGCCAATGGCTAGCTGGCAACATCGGCGAAATTGATCATGTGCTGTGTTCA 
GATGCCACCCGCACACAATTAACGTGGGAACGCGTCCAGCTTGGTGGCGCAACCGCCAAA 
GGCTCTAGCTTCCACAATGACATCTATGAAAACCAAGTGTCTGAATTTAAACATTTAATA 
ACAGGGCTCCCAGATGTAGTTGGTACCGCCCTACTCATCGGGCACTGGCCAGGCGTGGAA 
GAACTAGCCCATTATTTTGGCATCCGCGATGAACATCCCGGTTGGGATCAGATGGAAGAA 
AAGTTTCCCACCAGCGCCATTGCGGTGTTGGT^TTTAACACCCCTTGGTCAAAACTTGAG 
AGAAACTCTGCTCGGTTGACAGATTTTGTCATTCCACGGGGT 

>RXN00031-downstream
TAGTTCTGCTTCAATTGAACAAT 

>RXN00035-upstream 

GGTATTACCCGAAAGTAATGTCGTTAATACTGTTTTTAATGGCTATAAAGAGGCATAGGG 
TTAGTTATATGAGTAACCAACCATCGGGATCGTCGCGACC 

>RXN00035 

GTGCCTCTGTATAAACAGATCGCTTCTTTGATTGAGGACTCCATCGTTGACGGAACCTTG 
AGCATTGATCAACGCGTGCCTTCTACTAATGAACTAGCCGCGTTCCATCGCATTAATCCC 
GCCACCGCACGCAACGGCCTGACCCTCCTTGTCGAAGCCGGCATCCTCTATAAGAAGCGT 
GGCATTGGCATGTTCGTCAGCGCCCAGGCCCCAGCACTCATCCGAGAGCGGCGAGATGCC 
GCCTTCGCGGCTACTTATGTAGCACCGCTTATCGACGAATCCATCCACCTTGGTTTCACT 
CGTGCGCGCATTCACGCCCTTTTAGACCAGGTCGCTGAAAGTAGGGGCCTGTACAAG 

>RXN00035-downstream 
TAGCGCTTAAACCCTCTTGACCT 

>RXN00049-upstream

CTGAATCATGATTCATAAATGAACAAGGGTTCAGATTTTACAATACCCCCATTCCACCCC 
CTTATATTTAAGTCACCGCAGATCAGCTAAGGTTTTCCCT 

>RXN00049 

ATGCCCACGCCTTCGCAGCACAAGGACGCTTCAACAGCACAAACCGACAACCAGGTACCA 
ACTGGCCGCCGTGCAC7\AAAACGCGAACAAACCCGCGCGCGCCTGATCACTTCCGCTCGC 
ACACTCATGGCAGAACGGGGTGTCGACAATGTAGGAATAGCTGAAATCACCGAAGGCGCA 
AACATCGGAACGGGAACCTTCTACAACTACTTCCCAGACCGTGAACAACTACTCCAAGCT 
GTCGCAGAAGATGCCTTTGAATCCGTGGGAATTGCCCTCGACCAGGTGCTAACCAAATTA 
GACGATCCGGCTGAAGTATTTGCAGGGTCGCTTCGACATCTAGTACGGCACTCGTTAGAA 
GATCGGATTTGGGGCGGATTTTTCATACAAATGGGTGCTGCTCATCCCGTACTCATGCGC 
ATCCTAGGACCCCGCGCACGCCGAGATCTACTTCATGGTTTAGAAACTGGCCGATTCACC 
ATCGAAGATCTGGACCTAGCAACCACATGCACTTTTGGTTCACTCATCGCAGCGATCCAA 
ATGGCGCTTTCTGCAGATCAAGATTCCAACGATGACAAAGATCAGATTTTCGCAGCCGCG 
ATGCTCCGGATGGTGGGTGTTCAAGCAGCAG/U^GCCCGGGAGATCGCTTCGCGTCCACTC 
CCCGAAATATCCCCAGTCAAACCGCAG 

>RXN00049-downstream
TAGTGATCGGGCCTCAAATAAAC 

>RXN00291-upstream
TGGTGATTCAA 

>RXN00291 

GTGGCAACCGTCGCGTTGGTGGTAGCTATTTGCACCGGAATTTTCGCAGTTTTGATGATG 
GATCAGATGAAAACTGAGGCCGAGCACACAGCGCTGTCCATCGGACGTTGGGTGGCATCC 
AACCCGCAGATCCGCGAGGAAGTAGCGCTTGATACTCAAACAGGAGCAAACCCATCGGCC 
GAAGAATTAGCCGATGGAGATATCCAAGCGGTTGCACAGGCGGCCAATGAACGCACTGGA 
GCTTTGTTTGTCGTTATCACTGACGGTTTAGGTATCCGCCTGTCCCACCCAGATGAGGAA 
CGTCTGGGGGAGCAGGTGAGCACTAGCTTTGAGGCTGCCATGCGGGGTGAAGAAACCATG
GCGTGGGAGACTGGGACCCTCGGTGCGTCCGCGCGAGCAAAAGTGCCTATCTTTGCGCCG 
GATTCTAGTGTTCCAGTCGGTGAGGTCAGTGTTGGGTTTGAGCGAGACAGTGTGTATTCC 
CGCCTGCCCATGTTCCTCGCCGCCCTTGCTCTTATTTCTGTGTTGGGAATCCTTATCGGC 
GTGGGTGTAGCCATGGGCATGCGACGCCGTTGGGAACGCGTGACCTTGGGTTTGCAGCCG 
GAGGAGCTAGTGACCCTTGTGCAAAATCAGACTGCAGTCATCGATGGCATTGATGAGGGC 
GTGCTGGCGCTGAGCCCAAACGGAACAATTGGGGTGCATAATGAGCAGGCGCAATCCATG 
ATTGGTGCAGGTCCTATGAGTGGCAGGACGTTGAAAGAACTAGGGCTTGACCTGGGTCTT 
GATGGCGTTGTATTGCATGGTCAGCATCCGGAAACCGTTGCCCATAACGGCAGGATCCTC 
TATCTGGATTTCCACCCCGTGCGCCGTGGGGATCAAGATTTAGGCTACGTGGTAACCATC 
CGCGATCGTACCGACATCATTGAACTCAGTGAACGCCTCGACTCTGTGCGCACCATGACC 
CACGCACTCCGCGCCCAGCGCCACGAGTTTGCCAACCGCATCCACACCGCAACAGGGCTT 
ATCGACGCCGGCCGCGTCCACGACGCGGCAGAGTTTCTAGGCGATATATCCCGCAACGGG 
GGACAGTCACATCCATTGATCGGATCAGCGCACCTCAATGAAGCATTTTTGAGCTCATTT 
TTAAGTACTGCTTCTATTTCGGCATCTGAAAAGGGCGTTAGTCTGCGCATCAACTCTGAC 
ACGCTCATCCTTGGCACTGTTAAAGATCCAGAAGATGTAGCAACCATTTTGGGTAATTTA 
ATCAACAATGCCATCGACGCCGCGGTGGCAGGTGAAGCCCCACGGTGGATTGAGCTTACG 
TTGATGGATGATGCCGATACGCTGGTCATTTCTGTTGCAGATTCTGGTCCTGGAATCCCA 
GAGGGCGTGGATGTATTTGCCACAGCCACCCAGATAGGAGACTCTGAAGATAATGAACGC 
ACCCACGGGCATGGCATTGGTCTAAAACTGTGCCGGGCTTTGGCTAGATCACATGGTGGC 
GATGTCTGGGTGATTGATAGAGGAACCGAAGATGGCGCTGTATTTGGAGTGAAACTACCG 
GGAGTAATGGAG 

>RXN00291-downstream

TAATGGATCAAACACTTAAAGTT

>RXN00363-upstream

TGAATTTGATGGTGTGAGTCATGGTGGGTCCTTTTGTGAAATTCGATCCAAGCGGGCTTT 
GAGTAACATGTTACCGGTTACTGTGGTGAATTGTGCGATA 

>RXN00363 

ATGTCAGACATGCCAACAAAAAGGGTTGCCCCCGCACGCTCACTCACCGACCAAGTCATG 
GATTTCGTCCGCGAATCCACCCTTGATAAAACAATGGTCACCGGAGAGTGGTACAGCGTT 
TACCAGGTCAGCGACCAATTAGGCATTTCCCGCTCCCCCGTCAGAGACGCGCTGCTCCGC 
CTGGAAGAAGCAGGGCTCATCCGCTTCACCAGGAACCGCGGATTCCAAATTGTCGAAACC 
AAACCCTCTGATGTCGCCGAAATTTTTGCCCTTCGTCTAGGCATTGAACCCGCCGCAGCA 
TACCGGGCAGCACAGCTACGCACCGAAGAACAGCTCCACGAAGCAGATGACATCATTGCA 
CTCATGGCGCAAGCCGAGGCCGACAATGACGAAGAAGCATTTTTCACCCATGACCGGCAG 
TTTCACCGACAAATTATGACCATGGGACACTCCCAACGCGGGGCTGACCTGGTAGAAAAA 
CTACGCGCACACACCCGTATCCTCGGTGCTTCTACTGCCGGGAACAAACGCACCCTTGGC 
GATATTTTGGAAGAACACGAACCAATCTTGGATGCCATCAAACGACAATCAGCAGAAATG 
GCACGAGCCACCATGCGGGAGCATATCCAAGTCACCGGAAAGCTACTACTAGAACAAGCA 
GTGGAAAAATCCGGCGAAGGAGCTGCTCAGAAGATTTGGGATCAGTACACGGCGGGAGTT 

>RXN00363-downstream
TAGGCATATTTACCTAATCAATT

>RXN00464-upstream

GAATAAACCAAAGGGTGGACACTTATTCGAACGGGTGCTTGATTTAGGCAACTTGAGCAA 
AATTCTGCCATTTCGCCTCAAATCGGGCTAGTTTTGAAGC 

>RXN00464

ATGAGCGAACGTCAGCTGGAAAAGTCAATTGAGCACGCCGTCGAGTTAGCCCGCGAAGCC 
CGAAACATCGAAGTTTTTACCGGAGCCGGAATGAGCGCCGACTCCGGGTTGGAAACGTAT 
CGTGATGATAAAACCGGGCTGTGGAGCAACGTAGATCCACAAGCGATGGCAAGTATCGAT 
GCATGGCGCAAAGATCCAGAGCCAATGTGGGCGTGGTATCGCTGGCGCGCCGGGGTGGCA 
GCTAGGGCAGAACCCAACGCGGGGCATCAAGCTATTTCCTACTGGGAGGGGAGTGACACC 
GTCGAACACGTTCACATCACCACCCAGAACATTGACAACCTGCACGAGCGAGCTGGCTCT 
AGCGATGTGACACATCTTCATGGCAGCTTGTTTGAATACAGGTGCTCTGATTGTGCGACT 
CCATGGGAAGACGATAAAAACTATCCGCAAGAACCCATTGCACGCCTTGCTCCTCCACAA 
TGTGAAAAGTGCGGAGGGCTGATTAGACCAGGTGTGGTGTGGTTTGGTGAGAACCTGCCC 
GTAGAAGAGTGGGATATTGCAGAGCAACGCATCGCAGAAGCCGATCTCATGATCATTGTG
GGTACCTCCGGGATTGTTCATCCTGCAGCAGCACTCCCGCAATTAGCCCAACAACGCGGC 
GTTCCCATCGTGGAGATCTCCCCAACGCGCACCGAACTTAGCCGGATCGCAGACTTCACC 
TGGATGTCCACCGCAGCCCAAGCGCTACCAGCGTTGATGCGAGGTTTGAGCGCC 

>RXN00464-downstream

TAACATGACTGAAGATGAGTTAG

>RXN00467-upstream

AGCTGATGGAAAAGAACTTCTCAACATTCTCCGCAACAGAAAACTACCCGCACTGGAATC 
AGCTCCGCTGGGACAAAACAGCCTGGACTAAGGTGTAATC 

>RXN00467

ATGCACATCTCAGATCTTCCCGATAGGTCCCAGGACTACCTGAAGACAATCTGGGACATC 
ACAGAACTCCTTGATGATCAACCAGCAGCACTCGGCGATATCGCCGAAAAAATGAACCAG 
AAAACTCCTACCGCCTCCGAAGCAATCAAAAAGCTGGCGGCAAGGGGCCTGGTCAACCAT 
GAAAAATATGCTGGTGTCACCCTCACTGAACAGGGCAAAACGCTAGCCATCGACATGGTG 
CGACGCCACCGCCTGCTGGAAACCTTCCTCCACGATGTTTTGGGATACACCTGGGACGAA 
GTCCACGCCGATGCAGACCTGTTGGAACATGCAGCCTCTGATCAGCTCATCGAACGCATC 
GATGCTCACTTGGGTCGTCCACGCAAAGATCCCCACGGCGATCCCATACCAACTGCCGAA 
GGCGTTATTGAAGAGTCTCCCCGAACCACCCTCGAGGCAGTTCAGCCAGGGGAGACTGTC 
ACGATTTCCAGGGTCAAAGACATTGATCCTGAATTGCTGCGCTACCTCGCGCAATACAAC 
GTCTCACCAGGATGCCGGATCACCGTTGCGTCCGGCCCACTAGCTGGCATGGTGCATGTC 
GTTGTAGAAGGCACCGACACCAGCTTCCCCCTGGCCGAAACGCAACTGCCATTAATTACA 
GTGCAGGAC 

>RXN00467-downstream

TAAGCAGATTCATCATAATGGTG

>RXN00486-upstream

GTTTATTCATGCCCTTGATTATTGCCAAAGAAACCTTTAAGGACTAGATCGAAAAACAGC 
CAACTATAGTTAAGTAATACTGAACAATTTTGGAGGTGTC 

>RXN00486

GTGCTCAATCTCAACCGCTTACACATCCTGCAGGAATTCCACCGCCTGGGAACGATTACA 
GCAGTGGCGGAATCCATGAACTACAGCCGCTCTGCCATCTCCCAACAAATGGCGCTGCTG 
GAAAAAGAAATTGGTGTGAAACTCTTTGAAAAAAGCGGCCGAAACCTCTACTTCACAGAA 
CAAGGCGAAGTGTTGGCCTCAGAAACACATGCGATCATGGCAGCAGTCGACCATGCCCGC 
GCAGCCGTTCTAGATTCGCTGTCTGAAGTGTCCGGAACGCTGAAAGTCACCTCCTTCCAA 
TCCCTGCTGTTCACCCTTGCCCCGAAAGCCATCGCGCGCCTGACCGAGAAATACCCACAC 
CTGCAAGTAGAAATCTCCCAACTAGAAGTCACCGCAGCGCTCGAAGAACTCCGCGCCCGC 
CGCGTCGACGTCGCACTCGGCGAGGAATACCCCGTGGAAGTCCCCCTTGTTGAGGCCAGC 
ATTCACCGCGAAGTCCTCTTCGAAGACCCCATGCTGCTCGTCACCCCAGCAAGCGGCCCA 
TACTCTGGCCTCACCCTGCCAGAACTCCGCGACATCCCCATCGCCATCGATCCACCCGAC 
CTTCCCGCGGGCGAATGGGTCCATAGGCTCTGCCGGCGCGCCGGGTTTGAGCCCCGCGTG 
ACCTTTGAAACCAGCGATCCCATGCTCCAAGCACACCTCGTGCGTAGCGGCTTGGCCGTG 
ACATTTTCCCCCACACTGCTCACCCCGATGCTGGAAAGCGTGCACATCCAGCCGCTGCCC 
GGCAACCCCACGCGCACGCTCTACACCGCGGTCAGGGAAGGGCGCCAGGGGCATCCAGCC 
ATTAAAGCTTTTCGACGAGCCCTCGCCCATGTGGCCAAAGAATCTTATTTGGAGGCTCGT 
CTAGTAGAG 

>RXN00486-downstream

TGAGTTCTTGTGAGCCTTCAGAC

>RXN00551-upstream 

ACTTCTCAAGTGACGCCAAGGTAAGTTGTACTTTTTCTGTCCAAATTATTGCTTTTTCCG 
TAGATAGGTTATCGAACGGAAATTACTTGGCAATACCGCT 

>RXN00551 

ATGCTGGCAGGCATGCCTAATTTAAACGCTGAGGAGCTAGCAGTCCGCGTGCGACCCGCG 
CTGACAAAACTCTACGTTCTCTATTTCCGCCGCTCTGTGAATTCTGACCTCTCGGGTCCA 
CAGCTCACTATTTTGAGTCGCCTGGAAGAAAACGGCCCATCCCGAATTAGTCGCATCGCG
G7VACTTGAAGATATTCGTATGCCAACCGCTTCGAATGCTCTGCATCAGCTGGAGCAACTC 
AACCTGGTTGAGCGTATCCGCGACACCAAAGACCGCCGAGGCGTGCAGGTTCAGCTCACT 
GATCATGGACGCGAAGAGCTTGAGCGCGTGAACAATGAACGAAACGCAGAGATGGCTCGA 
CTCCTTGAAATGCTCACCCCAGAGCAGCTGGAACGTACCGAAGACCTGGTGGATATCATT 
ACTGAGCTTGCAGAGGTGTACGGTAGCTGGAAAGAGACCGACAGCGGTTCT 

>RXN00551-downstream 
TAACAGTTTTCTCCATCTCAACT 

>RXN00617-upstream

TACAGCGATGATGTGGTGGAGTTGGGTTCAGGATTTCAAGGGGTCTGGCCCCCACCGGAT 
TTTTCGCATTTGGGTTCGTGACAAAGGAAACGGTGTTGAT 

>RXN00617

ATGGATGAACAAGAAGCCCTGTTCGATCGCTTCTCCAGAGGCTCCCAAAAAAATTCACGG 
CGTCCCGGTGGCGCTGGCCTGGGATTATCCATTGTCAAGGCGATCGGCGAAGCCCACGTC 
GGCCGAGCTTTCGTCAATTCCACACCAGGTCTAGGATCCATTTTCGGCCTGGAAATCCCC 
GCACCAGAACAATCAAAGGAATACACCCATGAGCAAGATCCTGCTCGC 

>RXN00617-downstream
TGAAGATGACGCCGGCATCGCAG 

>RXA00630

GCAAAGATCCTGGATAACGTGTGGCACTACGATTTTGGTGGCGACGGCAACGTCGTGGAA 
TCCTACATCTCCTACCTGCGCCGCAAGGTGGACACCCAGGATCCGCAGCTAATTCAGACT 
GTTCGTGGCGTTGGATATGTTCTGCGCACCCCACGTAGC 

>RXA00630-downstream
TAAATTCTCCTATGGAAAATCCT 

>RXN00631-upstream

CCTACCTGCGCCGCAAGGTGGACACCCAGGATCCGCAGCTAATTCAGACTGTTCGTGGCG 
TTGGATATGTTCTGCGCACCCCACGTAGCTAAATTCTCCT 

>RXN00631 

ATGGAAAATCCTTATGTTGCTGCGCTCGATGACGAAAACCAAGAAGTCGGCGTAAAAAAA 
GAAGCAGAAAAAGAACCTGAAATAGGTCCCATCAGAGCTGCCGGACGAGCCATACCGCTG 
CGCACCCGCATCATTTTGATCGTGGTGGGTATCGCCGGGCTTGGTTTGCTGGTCAACGCG 
ATTGCTGTCTCCAGCCTCATGCGTGAAGTTTCCTATACCCGCATGGATCAAGAGCTAGAG 
ACCTCGATGGGGACGTGGGCGCATAACGTTGAGCTGTTTAATTTCGATGGCGTCCGCCAA 
GGGCCACCCAGCGATTATTATGTGGCCAAGGTTTTTCCTGATGGATCCAGCATCATCTTC 
AACGATGCACAATCGGCACCCGATCTAGCTGAAACCACCATCGGTACTGGTCCACACACT 
GTGGATGCTGCTAGCGGTTCTGCCTCCAACACTCCGTGGCGTGTGATGGCGGAAAAGAAC 
GGTGACATTATCACCGTGGTGGGTAAAAGCATGGGGCGTGAAACAAACCTGCTGTACCGA 
TTGGTGATGGTGCAGATGATCATCGGCGCGCTGATTCTGGTTGCTATTTTGATTACTTCA 
CTCTTCCTAGTCAGACGCTCGTTGCGGCCGTTGAGAGAAGTTGAAGAGACCGCCACCAGG 
ATTGCGGGCGGTGATTTGGATCGACGTGTCCCGCAGTGGCCAATGACCACAGAAGTCGGA 
CAGCTGTCGAATGCCCTCAATATCATGTTGGAGCAGCTCCAAGCCTCAATTCTGACCGCC 
CAGCAAAAAGAAGCTCAGATGCGCCGATTCGTTGGCGATGCCTCCCACGAGCTCCGCACA 
CCACTGACCTCTGTGAAGGGCTTCACCGAGCTGTATTCATCAGGTGCAACAGATGATGCC 
AACTGGGTCATGTCCAAGATCGGTGGCGAAGCCCAACGCATGAGTGTGCTTGTGGAAGAC 
CTCCTGTCACTGACGCGTGCCGAAGGCCAGCAAATGGAGAAGCACCGCGTTGACGTGCTG 
GAACTCGCCTTGGCAGTACGCGGATCCATGCGAGCAGCCTGGCCAGATCGCACAGTCAAT 
GTATCCAACAAAGCTGAGTCCATTCCGGTTGTCAAAGGCGACCCAACTCGCCTCCACCAA 
GTGCTTACCAACCTGGTTGCCAACGGACTAAACCACGGCGGACCGGACGCGGAAGTCAGC 
ATTGAGATCAACACCGATGGACA/yy^CGTGAGGATTCTCGTGGCAGACAACGGTGTCGGA 
ATGTCTGAAGAAGATGCTCAGCATATCTTCGAGCGTTTCTACCGCGCCGATTCCTCCCGC 
TCACGCGCATCCGGCGGATCGGGCCTCGGCCTTGCGATCACGAAATCCCTGGTCGAAGGC 
CACGGCGGCACAGTCACCGTCGACAGCGTGCAAGGCGAAGGCACGGTGTTCACGATCACC 
TTGCCGGCGGTTTCT

>RXN00631-downstream
TAAAGGCATCAAGGGCCGGAAAA 

>RXN00651-upstream

GGCTGCCTCGGTGGTGGTCTCTGGGGTTGCTTCAGGTTCCGCCGGGGTACAAGCGGTGAG 
CATGATGGAAGCAGCGAGGATAGTAGGTAATGTACGACGC 

>RXN00651 

ATGCAGTCAAGCCTAGATCGTGTGTCGGAAACCGGACGCAATGAGCTCGATGTTGAAACC 
CTTGTGAAGAAGGGGAATCAACCGGGCGCGATGAGCTATCGCAACAGTATCCACATTTTG 
ACAGCCTCGCTGCTGGTCGTGGGGTTGGGAGCTTCCGCCCGCCTGACGCTGCCGATGTTT 
GCGCTGTCGTGCGTGCTGTTGTTTGTGTGGGGTTTTCTGTACTTCTATGGATCAACCAAA 
CGCGTAGATTTGAGCCACGGCATGCAGCTGGGCTGGCTGTTTGTGCTGACGCTGGTGTGG 
ATTTTTATGGTGCCGATCGTGCCCGTGTCCATTTATCTGCTGTTCCCGCTGTTTTTCCTC 
TATCTACAGGTGATGCCTGACGTGAGAGGCATTATTGCGATTTTGGGTGCGACAGCGATT 
GCGATTGCCAGCCAGTATTCCGTGGGGTTGACCTTTGGTGGTGTGATGGGTCCGGTGGTC 
TCTGCGATCGTGACCGTGGCTATTGATTACGCGTTCCGCACGTTGTGGCGGGTGAATAAT 
GAAAAGCAGGAATTGATTGATCAGTTGATTGAAACTCGCTCCCAGCTGGCGGTGACGGAA 
CGAAATGCGGGTATTGCTGCGGAACGTCAACGTATTGCGCATGAAATTCATGACACGGTC 
GCCCAGGGACTCTCCTCCATTCAAATGCTGCTGCATGTCTCTGAACAGGAGATTCTCGTT 
GCTGAGATGGAAGAGAAGCCAAAGGAGGCGATCGTGAAGAAGATGCGCCTTGCCCGACAA 
ACAGCCTCCGACAATCTCAGTGAGGCTCGCGCGATGATTGCGGCGTTGCAACCGGCAGCG 
CTGTCTAAAACCTCCTTGGAAGCAGCACTTCACCGCGTCACAGAACCGTTGTTGGGTATT 
AATTTTGTGATTTCTGTCGACGGTGATGTTCGCCAACTGCCCATGAAAACTGAAGCCACC 
CTTCTGCGAATTGCTCAAGGTGCGATCGGAAATGTGGCGAAACATTCAGAGGCGAAAAAC 
TGCCACGTGACACTAACCTACGAAGACACAGAAGTACGCCTTGATGTGGTTGATGACGGT 
GTGGGTTTTGAGCCTTCGGAAGTGTCCAGTACCCCCGCTGGCCTTGGCCATATCGGCTTA 
ACCGCATTGCAGCAGCGTGCGATGGAATTGCACGGCGAAGTTATAGTGGAATCTGCATAT 
GGGCAGGGTACTGCGGTATCTGCAGCATTGCCGGTGGAGCCACCAGAGGGGTTTGTCGGG 
GCGCCGGTTTTGGCAGATTCGGACTCAAGTGCTACAGGCGAGGTTGAACTAAGTTCTCCA 
ACTGACGATGAG 

>RXN00651-downstream
TAAGGCTAGACTAAAGTACGATT 

>RXN00822-upstream 

CAGTTAGGTGTCATCCGGATTTTATCTCAAACCCTAACACCCCAGGTGTTGCCACTCATC 
CGGACTCAAACAAGATGTGTGCAGATGAAGGAGAAAAGCA 

>RXN00822 

GTGGAAGGTGTACAGGAGATCCTGTCGCGCGCCGGAATTTTTCAAGGCGTTGACCCAACG 
GCAGTCAATAACCTCATCCAGGATATGGAGACCGTTCGCTTCCCACGCGGAGCAACCATC 
TTCGACGAGGGCGAGCCAGGTGACCGCCTTTACATCATCACCTCCGGCAAAGTGAAGCTT 
GCGCGCCACGCACCGGACGGCCGCGAAAACCTGCTGACCATCATGGGTCCTTCCGACATG 
TTCGGTGAGCTCTCCATCTTCGACCCAGGCCCACGCACCTCCTCTGCAGTGTGTGTCACC 
GAAGTTCATGCAGCAACCATGAACTCTGACATGCTGCGCAACTGGGTAGCTGACCACCCA 
GCTATCGCTGAGCAGCTCCTGCGCGTTCTGGCTCGTCGTCTGCGTCGCACCAACGCTTCC 
CTGGCTGACCTCATCTTCACCGACGTCCCAGGCCGCGTTGCTAAGACCCTTCTGCAGCTG 
GCTAACCGCTTCGGCACCCAAGAAGCTGGCGCGCTGCGCGTGAACCACGACCTCACTCAG 
GAAGAAATCGCACAGCTCGTCGGTGCTTCCCGTGAAACTGTGAATAAGGCTCTTGCAACG 
TTCGCACACCGTGGCTGGATCCGCCTCGAGGGCAAGTCCGTCCTCATTGTGGACACCGAG 
CATTTGGCACGTCGCGCTCGA 

>RXN00822-downstream

TAATCACCAAAGCGCTAAAAAGC

>RXN00826-upstream

TCGGCGCGCGCGATCTGGTGCTCATCTTGGTGTGTGCTGCCATTTCCGCGATCGCTCTAA 
CCGTGTCCATTCAGACTGGTTTCTTTAAGTTCTTGGGCAC 



>RXN00826 

ATGATCACAGTTTTAATTGATGGACAATCCGGTGCGGGCAAAACCACCTTGGCGGGTGAG 
TTAGCTGCCCGCACCGGGTTTCAGTTGGTTCATTTGGATGACTTTTATCCTGGTTGGACT 
GGCCTTGAAGCGGCATCGGAGATTGTTGCACGCCATGTTTTGGACGCGGACAACCCCGGT 
TTCTTCACGTGGGATTGGCACAACAATTGCCAAGGCGATTGGATCAAGTTGGAGCCTGGT 
CGAAGTCTCATTATCGAAGGCTCTGGATCAATCACTGCTGCAACAAAACGCAAGGCATCG
CTGTTGGGCGAGCTGGTGACCGTTCGTATCACTGGTCCTGAGGCTTTAAGAAAACAGCGC 
GCCCTCAACCGCGATCCTGATTACGCACCATTTTGGAAAGTGTGGGCGCAGCAGGAGCAA 
CGCCATTTCTCTTTAGGCGTTGAGGTGGATCATGAGATTGTGCTAGGTTCTGATGAGGCT 
TCGGGACGACCCGAAGAAATCTATGACAGCCTGGGAACGGCCCAGAGTTCT 

>RXN00826-downstream

TAAGAAAGTTTGACTAGAGAACA

>RXA00848

ACCACTGTCACCTTGGCTAAGGCTCGGTCTCTCTCCTTGGATGAGGCACTGGAGTTCTGT 
GGCGTCGACGAGTGCGTCGAGGTTACCCCTGATGTTCTGCGCATCCGCAAGGTCATCCTG 
AACGCTACTGAGCGTGGCCGTGCACGTTCCCGTGCGAAGAGCCTGAACAAG 

>RXA00848-downstream
TAATTCTCTTTTAGTTAAGAGTT 

>RXN00849-upstream

GCAAAGCTTTCGCCTGCTGATTGACCATATTGAGTCGCAGTGACTCAAGTTTCCAGGTAA 
ACTGGGAACAAATTTTAGGGAAAGGGAGTTGAACCTAACG 

>RXN00849 

ATGGTTACTTATACAACCCTTCTAGACAAGCCGATTTCAGAATCTGCCCCACGGAAAGCT 
CCAGAGCCACTTCTCCGCGAAGCTCTGGGTGCAGCTCTTCGTTCTTTCCGTGCTGACAAG 
GGCGTTACTTTGCGTGAGCTGGCGGAAGCTTCACGTGTGTCACCTGGTTATCTTTCAGAA 
TTGGAACGCGGCCGCAAAGAGGTGTCCTCTGAGCTTCTTGCCTCCGTGTGCCACGCTTTG 
GGGGCCAGCGTTGCGGATGTGTTGATCGAAGCTGCAGGTTCCATGGCGCTGCAAGCAGCG 
CAGGAAGACCTCGCTCGCGTC 

>RXN00849-downstream
TAAGCGCATGGGTGGGCGTCGAA 

>RXN00978-upstream

TCCTGGGTAGCATGGGCTTATGAGCACTGATAGCCAAAACCCTGTAAGAAAATCCTGCGC 
ACAGCCACATTCTTGTTCCCAAGAGGTGCGATTGAAAGCG 

>RXN00978 

ATGTCCAGGTCACCGCTTACTAAAGGTCTAAATCAACTTGAACACCTCGAGTTAGATAAG 
TCACTAACTGCGTGGTCGTGGGCAGAAGATGATCCTTTGTACCTCGCAGGTGAGAACTTA 
AACGGCAGTTACCTCATTGTCGCAGGACGAGTGCGGGTCTCTCGCGACACCATCGACGGG 
AAAGAACTCACCGTTGATATTGCAACGCCCGGCGATGTTATTGGTGCGATAGATACAGAA 
CCTCAGCCGGCAGTAGATTCCGCTTGGGCAATAGAAACCACCTGTGCGCTGTTTCTTCCA 
GCAACCGCGTTGGCAACTGTGATTGAACAGCATCCAAGTTTTGCTTTGGCGATGATTCGG 
ATGCAGCAGCAACGTTTGGCTACAGCCAGAGATCATGAAATTAACCTGACTACGACCACA 
GTTGAGCAACGAGTAGCTATTGCAGTGAGAACTCTGGGACGAAAAATCGGGCAACGACGA 
CCCGATGGAATCTTGCTCATTCAAGTTCGAATCCGGCGGGAAGATGTTGCGGGTTTAGCA 
GGCACCACCGTGGAATCTACTTCTAGAGTTTTGGCGCGATTACGTAAAGAAGGGGTCATT 
GATAGCGGTAGGGAA 

>RXN00978-downstream
TGATTGGCGTGGTCGATGAACGG

>RXN01081-upstream 

CATTGGTATTTGCTCGAAAGCACAACTTAAAACAGTAATAATGGTAATCACATTCTTTTC 
TATTAAGTAAGCAAGTACACCGGCCCATTAAAGAGGCACC

>RXN01081

ATGACCCCAGCAAACGAAAGTCCTATGACTAATCCATTAGGTTCTGCCCCCACCCCAGCC 
AAGCCACTTCTTGACAGTGTTCTTGATGAGCTCGGTCAAGATATCATCAGTGGCAAGGTT 
GCTGTCGGAGATACCTTCAAGCTGATGGACATCGGCGAGCGTTTTGGCATTTCCCGCACA 
GTGGCACGCGAAGCGATGCGCGCTTTGGAGCAGCTCGGTCTTGTCGCTTCTTCACGTCGC 
ATTGGCATTACTGTTTTGCCACAGGAAGAGTGGGCTGTTTTTGATAAGTCCATCATTCGC 
TGGCGTCTCAATGACGAAGGTCAGCGTGAAGGCCAGCTTCAGTCTCTTACCGAGCTTCGT 
ATTGCTATTGAACCGATTGCCGCGCGCAGCGTTGCTCTTCACGCGTCAACCGCCGAGCTC 
GAGAAAATCCGCGCGCTCGCAACAGAGATGCGTCAGTTGGGTGAATCTGGTCAGGGTGCG 
TCCCAGCGCTTCCTCGAAGCGGACGTCACTTTCCACGAGCTCATCTTGCGTTATTGCCAC 
AATGAGATGTTCGCTGCACTGATTCCGTCGATTAGCGCGGTTCTTGTCGGCCGCACCGAG 
CTCGGCCTGCAGCCTGATCTGCCGGCGCACGAGGCGCTAGACAACCACGATAAGCTTGCC 
GACGCCCTCCTTAACCGCGACGCCGACGCCGCAGAAACTGCGTCCCGAAACATCCTCAAT 
GAGGTGCGCAGCGCGCTGGGCACGCTGAAC 

>RXN01081-downstream
TAACGTGATACGCGCACTGCGTT 

>RXN01160 

AAATCATCCAACAAAATCAGCGACCTTGCCCGCCAGCTTAATCTGTTGCCGTATTTCACC 
AGGTATAAAGGCCGTACCGTCATGGAAGCAGCGCGCGATCTTGGCCAACCCTCCTCCCAA 
ATCATGGAAGACCTCAACAGATTATGGATGTGTGGTCTGCCAGGACTTCTTCCAGGTGAC 
TTGGTGGAGCTTGATCATTCCTTTAAGGAAGTAAAAATCCACAATGCTCAAGGCATGGAT 
AAACCCTTGCGCCTCACACCAACTGAAGCCGGTGTTTTGCTGCTGACACTTGAATCCCTG 
GAATCCCTCCCCGGTATTGCGAAACAGGAAGCGGTCGTATCTGCTGCGAACAAGCTACGC 
GCCATCATGGGGGAGTATTCCTCGACTGTTTTCGACTCCACTGGAGAAGACCTCGATGCT 
GAAGTTCTAGAGATCATCCGCGACGCCATGGATTTACACCAGCAGGTCAGTTTTGAATAC 
CACTCGCACAGATCAGACAACACCAGCCTGAGGCAAGTCAGCCCTGCTCATATCTTCACC 
CATGAAGGCGAAACCTACATCAAAGCCTGGGAAGAAGCTGTGAACCAATGGCGGACGTTT 
AGGCTTGATCGCATCCGAAGCATTGTGCTTCTTGACAGCAAAGCAGTGCACCCGGCGCGA 
GGGGTTTCAGTATCCACGGACGATCCTTTTGAGTTCGCAAAATCTTCCGATATTGCCACG 
TTATTGCTGCGTGAGGACGCAATGTGGTTAGGCAATTACATGGCCATGGAGGTGGATGAA 
ACGGTGGAACCGATTCGCGATAGCGACGGATTCAGCTGGCACACAGTCCACTTTCCGCTG 
CTTTCTAGGGATTGGTTCGTCCGATTCGCGATTGGCCATGCTGAGCATTTGAAAGTAACT 
AGTCCCGAAGATCTTCGGAAATGCATAAAGCAAAAGGCTTTTAGTGGTTTGTCAGCGTAT 
GATCATCACGTAGAG

>RXN01160-downstream
TAACACCCAAGAGTAAGACGCAA 

>RXN01211-upstream

GACAGCTGCAGCGGTGTGGGCGGCGAACCGCTACATGCGCTGGGACTCGTACCGCTAAGC 
CTGCAGCCGACGGGATTAAGGCAGCTAACATTGAGACACG 

>RXN01211 

ATGAATAAAGATTTCTGGACCGCAGGCTGGACCGCCCGCTGGTTTTCGCGCGGGGTTTCC 
CTTTTGGCCAGCCCAGTTACCGCCCCACTGAACTCTTGGCGGAGATTGCCTAACTTGGCC 
AAGTACACCCTCTACACCAGGGTGTCGTTGCAAGCGATCCCCGTGGTGTTGCTGTCGGCG 
TATTTCCTGGGCATCGTAGCTAATGCAGGCACCCTGAATCCCTCATTTGTGTGGCTGCTG 
GGTTTCTCGGTCATCCTTTTAATAGTGACGGTATTGGTTTATGAATATCAGCCATCGCTG 
AATTCTCATCCTAGGCGCAGCGTACAGCCGTTCTTCTTCACCGGGTTGGTGCTCAACGTT 
TTAGGCGTTGTGGTGTCTGTGGTGCTTCAAATTCCGGGCTTAAACATGTCGGACAACACC 
CGAGCAACTGCCCTTATTTTCACTCTTACCTGCGTATTTCTGCTTTCGATCGCCTACATT 
CCGTGGATGAATTACCGATGGGTTTGGCTGATCGCAATGTCTGCAGTGTTGTGGTGGACC 
AGCACAACGACTGATTATTTAAGTGCATTGTGGGTGGTTATCCCGCCACTCATGGCAGGA 
ACCGTCCGACTTTCCGTATGGACCGTCGATGTCATGAAAGAGGTTGAGCGTTCCCGCGAA 
TTGGAAGCCTCCCTCCGCGTCACCGAAGAACGCCTTCGTTTCGCCCAGGAACTCCACGAC 
ACTTTAGGACAACACCTGGCGGCAATGTCCGTGAAATCAGAACTGGCGCTTGCCCTGGCG 
AAACGCGGCGACGACCGCCTCGAAAACGAGCTGCGTGAGCTCCAAAAACTCACCCGCACC 
TCCATGTCGGAAATGCGCGACGTCGTCTCCGGCTACCGCACCGTCAACCTCGCCACGGAA 
ATCGAGGGCGCTAAAAGTTTGCTTGCCGACGCCCACATCCACCTTTCCGTCATCGGCACC 
ACGTCCCAGGTGTCACCCGCTCACCGAGAACTGTGCGCGTGGCTTGTCCGGGAAGCCACC
ACAAACATTCTGCGCCACTCTGATGCAACGGATGCCACCCTCACGTTGAGCAGCACAGAG 
GTGCGCATGGACAACAATGGTGTGAACAAGGACATCGGCAGACTCTCTGGTCTCAGCGCC 
CTGCGCTCACGAGCGGAATCAGCCGGAATGACGCTCATTGTGTCCCGCGAAGACGACCAG 
TTCAGCGTCCGCATGCTCATTAATGCACCTGCAT^TACACCTGCAGAAAAGGAAGCT 

>RXN01211-downstream
TAAATGATTTCCATTTCCATCGC 

>RXN01315-upstream 

CATGTGCGGTTGTGTCCTAAGTCATTGATGTCACCTAACGCCCCTAAGTTCACTCGGTGC 
AATATTGCACTGAGTGCAAGTTTACACTAGGTTTACTTCA 

>RXN01315 

GTGGATATTGAAGAGCAGCCCTCGTTAAGAGAAATCAAGCGCCAAATGACCCTGGAAGCG 
ATAGAAGATAACGCAACCAGGCTCATTCTGGAGCGTGGCTTCGACAATGTCACAATCGAA 
GACATCTGCGCAGAGGCAGGGATATCCAAGCGCACATTCTTTAACTACGTGGAGTCCAAA 
GAGTCTGTGGCCATCGGGCACACAGCCAAGCTCCCAACGGATGAAGAACGTGAAGCATTC 
CTGGCTACGCGTCATGAAAATATTATCGATACTGTATTTGACCTGGTAATCAACCTCTTT 
GGCAACCACGACAACTCCAAGTCTGGAGTTGCAGGCGACATTATGCGTCGACGCAAAGAG 
ATCCGGGTGAAGCATCCAGAACTGGCAGTGCAACATTTCGCCAGGTTCCACCAAGCACGC 
GAAGGGCTAGAACACCTAATTGTTGAGTACTTCGAAAAATGGCCAGGCTCCCAACATCTA 
GATGAGCCTGCAGATCGAGAAGCAATCGCCATAGTTGGCCTGCTGATCTCGGTCATGCTT 
CAAGGTTCTCGTGAATGGCACGACATGCCACAAGGCACGCAAGCTGATTTCCAAGCCTGC 
TGTCGCAAAGCAATTAAAAATACTTTTCTTCTTAGAGGTGGATTTTCAGAA 

>RXN01315-downstream
TGACATCACAGGTCAAGCCGGAC 

>RXN01349-upstream

AGTTGTCAGCTGGGATTTGCCGATGGTGTTATCTTTGATATTTACCTTAACCCCGTTTCT 
AATTCACGTTTTCTCGTTACCCGAAAGGAATTGATCGATT 

>RXN01349 

ATGGCGACATCACGTCGAGATGCCGAAAACATAGACCAGGCCGGTAGCGAATTCATTGAA 
TCTGATTCAGGACACACCGCAACCCCTGAAGAGGTAGTAGCCACCGCTCTGACATTTTTT 
GCAGAGGATGGTTTTAGCGAAACCAAATTGGAGAAAATCGCGAAGGCATCTGGCATGTCC 
AAGCGCATGATCCACTATCACTTTGGCGATAAGAAAGGCCTGTACATCAAGGCTGTTTCC 
TACGCGTTGCGATTGCTGCGCCCAGAGGCTGAAGCGATGCAACTTGATTCCGCGGTACCA 
GTTGATGGTGTCCGCAAAATCGTCGAGGCTTTATATACCTGCATCACCAAGCACCCAGAA 
GCAGTGCGCCTGCTATTGATGGAAAACCTGCATAGCCAAGACAGCGTGGATTCCACCGCG 
GCATATTCCGATGAATCCAATGTGCTGCTCAACCTGGATAAGCTGCTCATGCTTGGCCAG 
GATGCCGGCGCCTTCCGTCCTGGAATCTCCGCAGAAGACGTACTGGTTCTTATTAGCTCC 
CTGGCCTACTTCCGCGTATCCAACAAGGTCACGTTGAAGAACCTCTACTCCCTTGATTTG 
GAATCAGAGGCCAATATTGAAGGCATGAAGCGCATCGTCGTTGACACGGTGCTGGCATTC 
TTGACCTCAAATATTCAAAATTCTGGCAACTCCAGCTACCTGGTTGTTGGTGGCAAGACT 
GCAGAACCAGAAACTGATGACAGCGTCTACAGCTTTGATACGGACGTGTTCGAAAAC 

>RXN01349-downstream
TAAAGGGTATCGAGTAGTTTCAA 

>RXN01368-upstream

TCGCTTGGGAGAAGGAAAAATTTTCCAAACCTGCTCGGGGACCAACATCCCCGATGAGAA 
CAGACTTAAAGTAGCTTTGAAAGTGAGAGGGGGGCGAGTA 

>RXN01368 

ATGGAAGATTCAGCTGGGGACGTATCTGCAAAGTTGAAAGCAGGCCAGACTCGCACCGCA 
CTGGAGATGACTTTGGATGATCTGTTCGGAGCGGTTGAGCAAGAATGGCAGGAGCAGGCG 
CTGTGTGCGCAAACTGATCCTGAAGCATTCTTTCCAGAAAAAGGTGGCTCAACTCGCG/^ 
GCCAAGCGGATCTGCCAGGGCTGCCCGGTTCGGGATGAATGCCTAGAGTTTGCTCTTGAG 
CATGATGAACGCTTTGGAATTTGGGGTGGTCTCTCTGAACGTGAGCGCCGCCGCCTGAAA 
CGCGAAATTTCG

>RXN01368-downstream
TAAAACTTCAAGACCAGTAAGCG 

>RXN01445-upstream

GAGTCTCGAATGTATAACGAGTTTGTTCCTAATAACTTTATCATTAACTATGCCTGCAAT 
AGGCTTGGGAAATTTTGGCATCCAGGTAATCGCGCTGTCA 

>RXN01445 

ATGATCCCACTTATTAATGTACGTTTTCCCGTTGCCGCCTTACCTCTCGCATTAGTGGCG 
ACTGTATGGCTTAATGCTTGGGCAGACCATCTTCTCCTAACTGGTTTTATTGTTTATCTT 
GCTGTGGAATACGCAACAAGCCGTGGGCGCTTCGCTCTCGCATTGATTTTGGGAGTTGAA 
TGGATCTTAATTGCTTATGGGGTAGCTTTGGAAAGGCCTCTTGAGGCTAAAGACTCTCCA 
TCTCTCATTACCGAAATTTTGCTCATACTTGTAGCAGCTGGCACAGGGGCAGGTCGGTGG 
AAAATTTTGAGTGAACGCAAGCAACGTGCAATTACTCAGCAGGAAATCATCAAAAAAATC
CGTACTGATATAGCGCACTATTTGCATGACAGTATGGCAAGATCGTTGGCAATAATGATA 
GTTCAATCAAAGCTGACTGAACTAGAGCCTGATCCAAAAAAGATTCAAGAAAAACTAAAC 
AGTATTGCCAAAATTGGACAAGAGGCAGTGGCTGATTTGCATCAATTAGTTAGACACCTC 
GTGGTCGAGGAGTCTGCTGAAAAAGCCACAGCGTTTGGAGCATGGGCTGCAGTTTCTATT 
CATGACACGGTTAATTCTGCCATTCAGTTATTAGTAGATGCAGGACATGTCGTTTCCTTT 
GACAGTAGAAAAAAGAACTATAAGCTGGACCATATTGCTGAAACGGCGTTTGCTTTAGCC 
TTCAATGAGGCAGTCTGTAATGCAATTAAACATTCTCCGCCCAAGGCAAACGTTACTATT 
CGCATAACAGAAAAAGCACAGTCTCTTCAGATTCTAGTAATGAATCCTATTGGAGATTGG 
CATGCAAATGGGGAGTCCGCAATTCCAGGTGTGGGCATTGGCGTAGAAAGCTTAACCAGA 
AGGATACGTAATATTAAAGGACAGGTCTGTGTGACTTCACTGCAAGGATACTGGAAAGTA 
GTTATTTCACTACCTTTGAAATGTGAGGATTCT 

>RXN01445-downstream
TAAATTGTCTCTATTTGTTGAAC 

>RXN01773-upstream

CCCTAAAAACAACGACAGCAAAGCTGCGTTCGGATATATTCTCGAACTGCTGTAGTGCAT 
ACGCTACAGCCCGCTACCACCGTAAAAGCGAGAACATATC 

>RXN01773 

ATGACTGTTGATCTCTACCAGGCACGCATTCCTTTTCAGCGCGATGGCGTGCGCTTTGAT 
CATACGATGATCACCCACATTCAAGCCGGCCTGCATCTTGGTGGCTGCCGCGCAGCAGGT 
TTACTGCCTATACCAGCACATATTGATCATATTGTGCGCCTGACAGCCGCAGATTTCTAT 
GACACCCAGTCAGCACCGCAGCTGCTCAGCAACACTGTGCTTGATGTATTGGACACCACC 
ACTCAAGACTTGAAGGCATTGTGGCCTGTTGCAGAACATATTGCTACAACCATTCCTGAA 
TCTGAGAACGTGCTTATCCACTGCCAGATGGGTATCAACCGCTCAGCTGCACTCATGACA 
CGGGTGTTGATGTTGCGCAACGATTGCACCGCCGATGAAGCAATTGCACTGCTGCGTGAT 
CGACGCTCACCGTTTGTACTGTTCAATGAGCATTTTGTGGAACAACTTCGAGCACTG 

>RXN01773-downstream
TAAGCGCTCAAAGACCCATTACC 

>RXN01845-upstream

GGCTCCGTCCGTACCGGTGGTGAAGTAGGAGTAGCCGATGAACCCGTTGTCGAAGGTGGT 
TGCCTGCTCCGAGACGAGCACCGCACCCGTGCCGTCCTCA 

>RXN01845 

ATGATCTCCAACTCCTGGGCGATTGAGACCACCTGCGCGCTCTACCTGCCCGTCGAGGCC 
CTGGCCGAGGTCGTGGACGCCTACCCGCAACTGGCGTTGGCGATCATGCGGATGCAGCAG 
GACCAATTGGTCCGGTCCCGGGAACGCGAGACCGCACAGACCACCTCGACTGTCGAGCAG 
CGCGTGGCCGCCGCCCTCCAACACCTGGACGCCAAGCTCGGGCAAATCCGACAAGACGGA 
TCCAGCCTGCTGCAGGTCCGCCTGCGCCGCGACGACGTGGCCGGCACCACCGTCGAGTCC 
GCCTCCCGGGCAATGGCGCGGATGAAGAAAACCGGTGTCATCGACTCCGGCCGCGAATGG 
ATCGCCATTACCAACCACCAGGCCCTGGCCGACCTGGTCGCCGGCCTC

>RXN01845-downstream
TAAACGCCCACATCCCTCTCTCT 

>RXN02097-upstream

TTGGCTACAGATGTTTCAGCGCTTGCAGTGGGGTAGTGTGTTTAAACATCACAATTAGTT 
CTAGAGGAAAACGCATTTTTTGCGCGGGGAGAGTGTATAC 

>RXN02097 

ATGCCGGCTGGCATCGCAGACATGACAGATTCATTGCTCGGATGGGCATCACAAACTGAG 
CTGGATCTGAACCAGCGTCTTGCAGGGGTAGAGTACTTTCCACAAATTCAGCTGCGACAC 
GATGAGCTCGAGCGCATTCATCGGTTTTACGGCACCTTTTTGTCCCGCCAGGTAGGCGCG 
GGCGCAAGCCTTGGGGATCTTTTTGAAATGACCCCATGCCTGACAGTCACCACCTTGGTG 
TCTCGGGCGTCACGGATCAGCGATCCAGCAGATTTCTTCGGTGAATACATCGGAGGACTG 
GGACTTAGCGCAGAACACGCAGCAGTTGTTGAAGGGTTGACCGAAAAGCTCTTCGCACAG 
GCTGGCCTGCTCGTTCCTGAGGGAATTGCATCTCCATTGGAGTTGTTATCCATCCACGCA 
GGCATTAGTAACCACGAAGTGGCCGCAGTGCTGACCGAAGTGGAAAACGGCACCACCGAA 
TATCCATTCATGTTCGACGCTGTCCTGCGCCTAACCCCTGAGTGGGCACAGACCCTTATC 
GGCGGAGTTCAAGAACTCATTGAATTTGCCACCACCCACCGAACTTCTTGGTCAGACCGC 
CAGCGCGAATCCTCACTGCCAGCCATGATCGATGAGATCGTTGTGGCGGAACTTCGGGAA 
CGCCCAGTTGGTACTGCCGACCGTGAAAACTCCGTTGGTGTGGCACTTCGTGAGCTTCGC 
CCACGCCTCATCCTGGATGCAGAACGCCGCAAAGTCTGCCTGCGTCTACCTGAACAGCGC 
GTCAGCGACGATGAAATCAACTGGCGAGTCAGCCTAGAAGGCACCACCCGGATTTTCTCC 
ACCCGCCGAGCATGGGGCGATACTTCTGGATACTCCGAAGCCCTCGACATCACTGTCGAG 
CGTCAAATCCGCGAAACCACCGTCACCGACACCTCAAACCAAATCACCTGGGTTGTCCCA 
GTCGTGGACTTCAACGACCCAGTGCTGGTGTTTTCCGCGCGCGGTGAAAACCTCACCGAC 
AAGGTCTCCCTGCACCATCAAGAGATTTACGTTCTCGCGCCAGCGGAAGCAAAACTCGAA 
GACATGGTCACTGGCCAGCCAGTACCAGTTATTGAGCAATTCCTCGTAGAGGGCTGGAAC 
TCATGGGTGTGCTCCCGCGTGGACGCCCGTGGCCTGTCCTCTCTGAAGGTCAACAAAGAA 
GTCCGATGCATTGACCCACGTCGACGCGTTGCCTTCCACCACCCAGCCGAATTGGTCCCT 
CACGTACGATCCATTTCCGGACTCCCCGTACACGCGCAGTCCCTGATCGCCGAGTTCCCA 
CCAACCCTGAGCGGACAAGACGAAACCTGGATGCTCTCCATCTCGGCTTTCGCAGGTGTA 
GGCGCTGCTGGTGAAGAAATCGCCGAGCCAGAGCCTTTGGAAGTCCCTGCCGACGGTGGC 
CTTTTCGCCATCTTCGACCCAGAAATATACGACGCCCCATGGGTGGGTGAATACCTGGTC 
CGACTCCGCGGCCCACGCAATGAATCCTTCCGACCCGAATTCGCCATCGTCGAAGACATG 
ACCACCGAATTCGAAGTCGCCTCAGGTGCATCATTTCGAATCCCAACCACCACTGGTCTC 
AGCGAAGCCAGCCTACGCGTGCGTTCCGGTGAAAAGCACTTCACCGCAGAGCCACGCCTG 
GTCACCGTTGAAGCAACCGACCCCAACGCATCATTCGTGGTCACCACCGATGAAGGCGAT 
CAAATGCCATTGCGATTTGTGCCACCACAAATCGCCATCGAACTTCCACTGACCACCGAG 
CCACCAACCTGGCGCGTCACCCGTACTGTCTGTGGACCACGCGACCTCGACGGTGCAGGC 
GAACTCCGCATCCGCACCGGTGTCGATGTCGGCGATCCAAAGGTCAGTGTGCGCAACCAC 
CACGGTTCACCACTGCGAACCGTGAAAATGGTCACCCCTGACAACGGCCGTACCTGGATT 
GCCAGCATGAAGGAAATCGCAGCCAGTACCTTTGTGATGCCACGCGGATCCATCGAATTT 
GAGTGGACTGACCGCAAGGTTGACCGTCGCGTTTCCGTGACGATTGCTGTCATTGACAAA 
ACTGAGAACTTTACTGGCATCACCATCGAAGATGGAAAGCTCGTATTCGAAGAACTCGCA 
GCCGGTCGCCAACTCGCTGCATGGGTGTGGCCACAAACCGCACCGTCGGTAAGCGCAGTG 
GAACTTGCTGTCACCGGACCAGAGCTGGAACTCCCTGAAGTTCTCGTCGGCGCAGGCAAC 
CTGATTGTTCAACTCCACACCGCTGACCCATTCACTACCTCCGTGACCCCACTGTCACCA 
GGAAAAGCTGCGGTCACCGTTGAGCAAGAAGGCTACTACTCAGCACAAACCGAAGAATAT 
GCACAGCTTTCAGCATTCTTCGGTGGGGAAGTAGAAGAACCACCAATCAGTGACGCTGTG 
GTCCCCGCACTTTGGGATGTTTCCCATATCTGGACCGAACAGGGAAACACCGAGCATCTT 
CCAGTAGTCCATGCCGCCCTGCGCTCCTCACCAGCCGCAGCACTGAAGGGTCTGTCCGCT 
TCGCTGGTTCCCGCACAGGCACTACCTGGAAAAGTCATTTCCTCCGGACTGGCAGCCTCA 
CCGTTCACCACGGAATCACCAGCAACAGAAGTGCACCGCACCGCATGGATCGGAACCCTG 
CAACTCCTGGGTGCACTGCCAAGCGCATTCAAGGAAGCCGAAGAGCTTGGCAACCGCACA 
CCACTGCTGCCAATCCTCGGACAACTTGAGGAAGTCGCCGGCAAGAACATCCTGTCCACC 
CTTGCAACTGGCCGTGACTCCACTTTGGACACCGCATGCATCGACCAATCCACCGTTGCG 
ATTGCCGGCATGAACGAAACCCAGCAAAAAGCCCTGCTGGACATGTTCTTCAGCAACGCC 
GACATCGTTCCTGGACCACTAATGGAAGACAACACCCGCCTCATGGCAGTGTTCGAAACC 
TTCAAGAAGCGCGATGCACTCCGTGAGGTTCTCCAGACTGAAGGCTTGATTAAGACCGCT 
GTAGAACTTCTTCGTGCCATGCGTGGAACCCAGCGTCAGCTGTATTCCTCCGCACGTATT 
CGATTCGACAAGCTCGATGGTGTCAACACTGACAACCCAGAAAACATGTGGGCACTCACC 
CCAGTTGTGTCACTGGTGTTCGCGTTGTCATCCCGTTTGCATGCACACGAATTGATCGGC
AAGACCCGAACTCTCGATCGTGCATCTGCCGGTTGGGGTCGAATCGCTGATCTGGTGCCA 
GACCTTGTCACCGGTGACTTGATCTCCGCGGAGGCAATGGTTTTGGGAGCTCGAAACCCA 
GGACTCGTCGAT 

>RXN02097-downstream
TAGTCCCTGATTTCATCGGAGGG 

>RXN02266-upstream

GAGGAAAGTAGCGCTACATCTGCATATCTACCCCCTTAAAAATGAAGCATATAACCGCCG
TGTACCGGCCTTTTATTGATTTTGACGTAAGCTTGCACCG 

>RXN02266 

ATGACTCAAGATGAACACCCCCGACAGGCCGACTCCCATTTCAACATGCTTTTACCGGAT 
GGAAATGAAAACGCACACCAGCTTTCTGTCGCTCTAAATCAGGTGGCACATCTGTTGGCC 
TATGATGCGGACTCTTCAATTCATCGGCCTGATGGGCTAAGTCTGGCGTCCTATAGAATT 
CTCTTTTCACTGTGGACTGATGGCCCGATGAGTCCACTCCAGGTGACTGACAAGACTGGA 
ATGAAAAAGTCTGCGATTTCTAACCTGTTAAAGCCATTGCTCGCTGAATCTCTGATTGTG 
CAGGTGACGGCAGAAAATGATCGACGCTCAAAGGTTTTAAGCCTTAGCGAAAAAGGCACT 
ACATACATTCAGAAAACAGCCACCCGCCAAAATGCTTTGGAATCCGAGTGGTTTGGCACC 
CTGACCGACATCGAGCAGGATTTATTGGAGTCGTTGCTCAGGAAACTGCTCGACTCCAAC 
CGCGCATCCAAGGTTCGTAAAAACCGATCTAAC 

>RXN02266-downstream
TAGCGTCGATCCTTAGGGATGTA 

>RXN02270-upstream

CAGCTTGAGCAGGCATCGGCCGATGACGTACCAGGAGTTTCGGCATGATCCGGCGGCGTC 
GCATCGCTATTGGGCGCGGTCGTTTGTGGGGTGGCGGGTG 

>RXN02270 

ATGGATCAGGCGCGGCCGAATCGAACGCACTACGCCATGGTTGAGCTGGAGCAGCATGGT 
TTTTTAAGTGGTGTGGTCACCCAAAATGTCGATGGTTTACACGCGGAAGCAGGCACGAAA 
AACCTGGTCGCGCTGCATGGTGATCTCGCCCATGTGATGTGTTTGAACTGCGGTTTCGGG 
GAGGATCGACACCTCTTTGATGAACGTCTCGAAGCCGCCAACCCCGGCTACGTCGCTTCC 
ATTCGCCTGGAACCGGGCGCAGTCAACCCCGACGGCGACGTCTTCCTCGACGAAGAACAA 
GTACGCCGCTTCACCATGATCGGCTGCTTGCGCTGCGGCTCGCTCATGCTCAAACCAGAC 
GTGGTTTACTTCGGCGAACCCGTGCCCGCCGCGCGCAAAAAAGATTTAAAAAAGCTTCTC 
GACGCCTCCTCCAGCCTCTTAATCGCCGGCTCCTCCCTAGCCGTCATGAGTGGATACCGG 
ATCGTCATCGAAGCGCAACGTCAAGGAAAACAAGTGTCTGTCATCAACGGCGGCCCAGGT 
CGGGCGGATTCCCGCGTGGACATTTTGTGGCGCACCCGCGTTGCACCGGCCTTTGATGAC 
ATTTTGGACGCGCTGGACCTT 

>RXN02270-downstream
TAGACTTTTGGTGGCTTAAGTTC 

>RXN02362-upstream

ACAATTGTTCCCATTCGCATATCTCGTGTCCACCAGAGTCGCTACAGTAACGAGAAACTT 
AATTATTTGATCCGATTCTTCCGTTCAAAGGCCTCATTCC 

>RXN02362 

GTGACTATCTCTCGCCGACTCAAACAAGAGCGCAGTTTCGCTGACGATCTTCAAGATCTC 
AAAACTCTCAATGATCAACTGCGGTTTACAAACGCCAAATTGCAAGCTCGCATCAGTGGT 
ATTGGCAATGATGGAAAGAAAATCACGCGCCCTACCCCACTCCTTGCGCTGGATTTTCAG 
CTGACCGTTGAAGAATACGAAACGATCATTGCAATCTTGGTGGAAGCAGTTGGCGGAAAT 
CAATCCAAGCCAGCGATTCTTAAAGATCTGTTTATAGAATATCCACTCGTCTTCCTGGCA 
GCGCTTTCTGGAACCGCCATGCTCGATGCTCAAGAAGGTTTCTGGCCTGCGTTCTGGAAA 
CGCACTCAGGTGTCAGTTCCAGAGCATGTATACGACGCGATCCGTAAAGAACTAGTTAAT 
AGCATCCGCAAAAATGGCCTAGAAACTTTTTCTCTCGCTGACCTCAATCGACGCGAATAT 
GTCGGACTCATCCAACTTCACAGTGGCCTTTCTGCAAAAGACATGCTCGCCTTGGTCAAA 
TTTATCGATCACACTCGAGCAGAAAACCAAGGATGGGATTCTGGTGAGGACTTTGCATCA 
TATGCGAAGAGTGTCTTCTCCTCCGGGGACAACCTATTAACCACGGAGTCGCTCAAGCAA
TTAGTCACCCACATCCCTGCGCGTTCCGTCGACTTCATCGCCAGAGTCTATGAACTAACC 
AATTGGTACCGCGACCTCAAAGACCTCAATGAAGTAGAAGCCTTCGTAGGTACTCATGGG 
CTGCCGGAATTGTCTTTCAAATTTCTTCTGGAGTGTCTGAGCGGCGAAGCTGAACAAATT 
GCCGAAAAGACGAAAGCAGCACCAGCAAGCCTGGAAAACCTGGAACCTCCGCATCTCTAT 
CTGGATCCACAGAGTTTTGAACTCAGTCTTGTTTTCCCAGCGATCTCTAAAACTGCAGCA 
CTTCAGATTCCAGCACCAGAATGGACAGTGATTTATGACGGAAACTCCATTAAAGTTCGT 
CCCGAACAGGACTGGTCCTACGGAGGTTTCGCCGAATACCGTTTGCCTTTAGACAAACCG 
CTCTCCAGCTTGAGAGTCATCACTCCAACAGAGAAATCCCTAATTCTGATTGAAGGATTT 
GGCCACAAGAATCCCATTATGTTCTTTAAGAACAACGGTCAGCCATATGCAAACCAAGAA 
ATGCTCAGTGGCAACGCTGTCACAGCTATAGTCCCAGCTGCAGCAATCATTCGTGCACGT 
ATGCGAGCTTCCAAGACTTTCAACTATCAAGACTTGGGTCCCTTGTCCGGATGGAACAAG 
TGGGTCATTCGTTCGATCCCACTCAAACGAGCTGAATCGATCACAGTCTCCCACGGTGGC 
TTCAGAAAAGAACTCCCAGTTCGACGCAAAGTTGATGTTCAATGGATTACTGAGGATCTC 
ACGATCGAGAATCTTCAAGGTCTCGATCATGAGCCCGTTTTCCACACGAGTCCCCGCATC 
GAATTCCCCACCTCTGGATCAAACTGGGTAATTCAGTATTCACAGATTCTTCCAGATGGC 
AGCCTCATCGAAATGGAAGATTACCCAGTCGAACCTGAAAACTTCGGATACGAACTAGAC 
CTCTTCGAAGAATCCGACGACCCTTGGGTCGGGCAATTTTTAGTAACTCTGCTCAAGGAT 
GAAAAAGTCTATGAAACCCGCAAATTCAATCTCGCGGAAGGCCTCGATCTTTCCCTAACA 
TTCAGCGGAGGCGGACCTGAAAATCGATTTAGGTACCCCAGCATCAATCAGGGACAAACT 
GGCTTAACAAAGACTTTCGCCCGTTTTAGTTCCAATTCTGAAAAGCACATCAGGTTCCCA 
GATGAGATCATCGGGCTTGATGCATTCACCTCTCAAAAAGCGTTTAACATCGCAAGCGGT 
GATTTCCCTGAGGACTACAACCTCGACGTTTTCATCACGCCTCCGCAACTTCACTACCAA 
GTACCTGTCACACACAGCCAAACAAAGTGGGAAAGCACAAAGACGACACTAGATTTCAAT 
GACTTTGCCGATGGAAACCTCCAGATCAGATTCCCTAATGAAGTCTATGATCCAAACTTG 
AAAATCATTAAAATGGTGGCATACAAGAAACCTGAGTCCAGTGAGCCTAAATACTTAAGC 
AAAATTGGTTCAAGCAAAGTGTGGTCTATCCCTATGGATCGCATCAAGGAACTCATGGAT 
GATGATGCCCAATTCCTTTTGATCGCGGAGTGGTTCGCTGAAAGTAAAGACCAGCACCGA 
GAGAAGATCATTAGCGAAGCTAAGCGAACTGGAAAAATCTCCAATGCAGCGCTTAAGAGT 
GCTCGTCCTCAACCTCAAGCAAGTTCCCACATTGCAACAATTGAGAAAAAGCCCCTACTA 
GCTGCGGCTGAAATTAAGCTTTCTACCGTGGAGTTGGAACTTGGTCGGCACACTTCTAAG 
AGACTGGAAGGCTGGGCATGGTCTGCGCTCAACCCGCTTGATCCACCAATCAAAGTCGAT 
TTCCAAGGAACCTCAGGCTCACTTCCAGACACCCACTTCGTCGTTGGCCCTTTAATCGTG 
GAAGTGAGAGAAAAAGAGTTTCTCTCCCAATGGCAGCCAAAAGTTCCCTCAGTTAAAGCC 
GTGGTTGCAAATGATCCCTCATTTGAATTGGACCCTCAATTTGATCCTTTCCTCACACAC 
CGATGGATGTTCGCTCCACGAAGTGGGAAGGTCTTACTCCCACAAGAAATCCGCACAGTG 
TGGGACGCCCGATTCAATATGCGCCATGTCTTAGCGCAGCGTGAAAACCTTCATGTGAAA 
TCGATTCAAGATTTTGACGATGCCACCAGTACCTATCTCACCAGTGATCCTCGGGTGGCA 
TTAGATGAATTGGATAAGAGCTCAATTCCGTCTAATTCCCACTTTGAATCATTCATCCGA 
TCCGGATTAGCTGAGCTTTCTTTCGAAGTTGACGACACAGCCGGAGATATCCATCGCGTT 
CCCTGGATCGGCCTGATCCAGGAAATGAACGACCTCAGAATTCTGCAGATACAAGGCTAT 
GAAACAGAAGAACGAGCCATCGAACGCCGAAATTCGCAGAGCTACATCCGTGAGATAGGA 
GGCAGTGAATTGTGGAATATCCTAAAAGGAAATTCAGAGGGATTGTCTCTTGCTCAAAAA 
TGCGCACCACAAGCCACTGAGATTAATGTGATTCGTAATTCAGGCTTGGAAGCTATGCGC 
AATGGGCTGGGCGCCGATCAGTTCAGCGCCGAGTTTATTTCAGCAGACTCACGCCTACGA 
GCTCAGCTTG/^TGGTTGGAAAACCGCCGAGAGCTCAATGATCTCGGCCAGCTCCCAACG 
CTCTTCGATTTCGCCGAGAAATACGAGTACCTCATCGATCACTTAGGTGATGATCGCATC 
AAGGTCACTGCACGTGAGCTGTCTACTCTTGCGTCGGAACACCGTCGCGGCAACGCTGAA 
AACTGGCTTTATGCACCATATGTGTCATTCATTTACAGCTTGCTTAACCGAATGATCGCT 
CATGAAGTAATACGTCCGATCGCTCAGATCAATTACTCACGGCACGATTGGGCAAACGCT 
GCTCGGCTGATTCCTCGTCTCACAGGATTTGACCTGGTGAGTGCCGAAGCGAAAGTGCTC 
AGCGCAATAAACAACAACAATATAATCCCAACTGCAATT 

>RXN02362-downstream
TAAGGATCACTATGTCCAACGCA

>RXA02365-upstream
GAATTTGTGGGACCGACA

>RXA02365

ATGGAGCTGTGCCAGACTAAGCGTGGTCAGATGGGTGGCATGGACTACCTTTCGGAGGAC 
CGCGTGGAGCTGCGCTACACCATGCCTTTGGGTGAGATCATCTTTGACTTCTTCGATATG
TTGAAGTCTCGCACCAAGGGTTACGCTTCGCTGAACTACGAGGAAGCTGGCGAGCAGACT 
GCCGACCTGGTCAAGGTAGATATCTTGCTCCAAGGTGAACCTGTGGATGCATTCTCTGCG 
ATCGTGCACCGCGATAATGCGCAGTGGTACGGAAACAAGATGACTGTGAAGCTGAAGGAA 
CTGATCCCTCGCCAGCAGTTCGAAGTTCCTGTGCAGGCAGCCATTGGTTCCAAGGTTATC
GCTCGTGAAAACATTCGTGCACTGCGCAAGGACGTGTTGGCGAAGTGTTACGGTGGCGAT 
ATTTCCCGTAAGCGCAAGCTTCTGGAAAAGCAGAAGGCTGGTAAGAAGCGCATGAAGAAC 
ATCGGTTCGGTCGAGGTTCCTCAGGAAGCATTCGTAGCAGCACTGTCTACCGACGAGGCA 

>RXA02365-downstream 
TAAAAAACTTTAGCCTCTTTTAG 

>RXN02450-upstream

CCAGACCGATCTCCATCAATTCACAGGCGAAGCGAACCGAAAACGATTGCGTTGTTCTAC 
ACTGATCAGAGCCCGTCCCTCAACAAGAAGAGCAACACCA 

>RXN02450 

ATGAATCTGAAAGATCTCAAGGCCGCAGAGACCCGTCAAAGGTTTATCGATGTAGCCCAC 
GAACTCTTCTTGGAGCACGGTTATGGTTCCACCTCCATGAATCAGATTGCTCAGGCAGCG 
GGTCGTAGCCGGGCAAACCTTTACCTTCATTTCCGTAACAAGCCCGATCTCATGATGGCT
AAAATGCGGGAACTTGAACCCGCGGTCCGCACCCCTGTCCTAAAAGTTTTTGATCTCCCT
GAACACACTTTGGAGTCCATTCTTAGATGGCTGGACTCCATGACGGAGGTGTGGAAAGCG
AATGCCAAAGTGTTCGGGGCGATGGAACAAGCGATGGTCGAAGATGCTGCGGTGGCCGAT 
GAGTGGCTTTCAATGATGCAGAGGTTGAGCCAATCGGTGCCCGAATTGGTTGAGAATGAA 
GAGCGTCGAGTTCAGTTCCTGGCTAGCTTGATGGGCATGGATAGAAACTTTTACTTCCTC 
TATGTCCGAGGGCAAGATGTTGATGAGGAATTGCTAAAGTTGGCTGTGGCTCGCCAATGG 
TTGGCAGTTTTCCAA 

>RXN02450-downstream
TAGGCAATGCGCCCCAATCCCCT

>RXN02493-upstream

GGTTCCGTAGTAAACCCAGGCGGCACCTACCTCGATCCTGAGGCAGCAGCAGCCGGCGCA 
GCAGCAGTAGCAAACCAGGGTAATAAGTAGCTATTTGTAG 

>RXN02493

GTGAGCACTCTTCTTGCTTTCGTATTGGGCGTGGTCCTCATGGGCCTCGCCCTACCTGCG
TATACGAAAATTAAAGATCGGATGCGTCGCCACAAGTCCGCGGTCACCCTGTCCGAAAAC
CAGGTCACCACGGTGGGGCAGGTCCTCCACCTGGCGATTCAAGGCTCCCCAACGGGAATC
ACGGTTGTCGATCGCACCGGCGACGTCATCTTATCCAACGGCCGCGCCCACGAATTGGGC
ATCGTCCACGAAAGATCCGTCGACGGCAACGTTTGGCGCGTCGCCCAGGAAGCCTTCCAA
GACCAAGAAACCCACTCACTCGACGTCCACCCAGACCGCAATCCGCGGCGCCCGGGTAGT
CGCATCACCGCAGTGCAGGCAGTGGTCAAGCCTTTAACGCTTATCGACGATCGTTTCGTG
ATCATCTATGCCTCCGACGAATCCG/yU^CGTGCGCATGGAATCGGCACGCCGAGACTTC
GTCGCAAACGTCTCCCACGAACTGAAAACCCCCGTCGGCGGCATGGCACTCCTCGCGGAA
GCCCTCATGGAATCCTCCGACGACCCAGAACAAGTCGAATACTTCGGATCCAGGCTCCAC
CGCGAAGCCCACCGCATGGCCGACATGATCAACGAACTGATCTCCCTTTCCAAACTTCAG
GGCGCCGAACGACTCCCTGATATGGAACCCGTCCAGGCTGACGACATCATCAGCGAAGCC
ATCGAACGCACCCT^CTCGCCGCCGACAACGCCAACATCGAAATCATTCGCGGCGACCGC
ACCGGCGTTTGGGTAGAAGCCGATCGATCCCTGCTGGTCACAGCCCTGGCGAACCTGATC
AGCAATGCAATCAACTACTCACCAAAATCAGTCCCCGTCTCCGTTTCACAAAGCATCCGA
AACGACGTGGTCATGATCCGAGTAACCGACCGTGGCATTGGCATCGCACCCGAAGACCAA
GGCCGAGTTTTCGAAAGATTCTTCCGCGTCGACAAAGCCCGCTCCCGCCAAACCGGCGGA
ACTGGCCTTGGCCTCGCGATAGTCAAACATGTCATGGCTAACCATGGCGGTAGTATTAGT
TTGTGGTCACGTCCTGGAACAGGCTCCACATTTACACTTGAACTCCCTGTATACCACCCA
GAGTCCAAGGAACCGGCAGGATCTAAGCAGGGACCTAGTTTGGATTCACCTATTCGTACG
ACTGCGTCCAAAGCATCTGGGCGCCGAAAGGAAAAATCA

>RXN02493-downstream
TGACGAGAATCCTGATCGTTGAA

>RXN02506-upstream

CCTTTCCCTGAACTCTAAGCAATTGTGATCTATAGTACAAATGCATAAACATTAACCGCG 
ATATCCATCTCTTGCATACCGGCCGAAAGGTTTAGCACAC 

>RXN02506 

ATGCACCTCAATCAGCTCGAATTTTTCATCGCAGTAGCCCAACACGGACAGATCAACCGC 
GCCGCCGAAGAACTCCTCATTTCCCAACCCGCTCTCAGCCGACAGATCTCCGCACTTGAA 
AAATCCGTCGGAGCTCCACTCTTCGAACGCCATTCCCGCGGTGTCTCCCTCACAAAGGCC 
GGAGAAATCCTCCACGAAGAAGCCCTCCGAACGCTTAGCAGGATGCAATCGGTAGTCGAT 
GAAATCCAATCCGGTGAGCACCTCATCACCAGCATCAACATCGGAGTTCCCCCTGGAATC 
CCCATCGACTGGTTGCGCTGCCAACTCATCGATTTAGGCCCCGAGACCCGCATTTCACTG 
ATCGAATCCCCCACCGATGATCAGCTAAAACTTCTTAAACAACGCGAACTCGACATCGCC 
CTTTGTCGACGCCAAAGCGAGGCCTTTGCCACCACACTTGTCCACGAACAAGAACTGGGA 
ATCGTCGTCCGAAAAAACTCCGAACTGCACCAAAAAGTCGCAGGAAAAGACAACGCCACA 
CTCTTCGATCTTGAAGGGCTTCGAGTCCTCGCACACTCCCGCGGTGAAGTAAGAATTCAG 
GAAGAAATCCTCAAAAACGCCATGCTCGCCGCAGGAGTTAATGCCACGTGGATCTTCCGA 
AAATTTGGGCAATATAGCTCACTGATCGCAGACCTTGTCCAGGCCGATGTCGCACTCACA 
ACAGAGGAATCCGCCCGCACCAACTTCCCCAGCTGGCAATGGGTCCCCATCGAAGGCGAA 
GACGCCTCCGGAAATGACCTTGTTGTTCGCACCTGGATCACCTGGAACCCCCAACCCACC 
CCCGCGGTGAAGGCCCTGATCCAGAAATTTATTGACGGAAAC 

>RXN02506-downstream
TGAGTTCTAAACAGCCGCCATGA

>RXN02553-upstream

TGGCGCAGGTGTTTGTGCCAGAAAATAACCGCGACGCTGATCATACGGTCAGCTTGCTGG
CGGAAGTACCACGGTGGTGGCGTTGGCGGATCACCTTCAA

>RXN02553

ATGGCGGTGAAACGTAATGAGTTGGAACCCGAGCTGACGTCCAACCCCAACCCATTAAGC
GCAGAAGTGCATCATTTGTATCCTGAGGAAACTCGTCTTGCAACGGAGATCCTGGAACGC
ACCAACAATTGGCTTGCTGAAAAAGGGATCCCTCCGCTGCCACCAGCGGAAGTTGTAGCC
ATCTCATTACACCTGGTTAATGCTGGTTTCCGCACGGAAGACCTCGCAGAAACCTACGTG
ATGACTGGCGTTTTCGAGCAGCTCTTTGAGGTAATCGATTCCTCGTTTGGCATCACCCTT
GACCGACAATCCGTCAACGCCGCACGGTTTATCACCCACATGCGCTACTTCTTTGTTCGC
GTTCACCACGACGGACAACTCAACGACGGCATGTCCGTGCTGCGC/^CAGCCTAGAAATT
TCCCACCCGGATTCGGTGGCATGTGCGGAAAGACTCAGCCAAATCCTCAGCCTTCGATTG
GGTGCCGAACTTTCCTCCGACGAGCAAACCTACCTCGCGCTCCATGTCGCGAGGTTGGCT
GAAGATCGAGGTACTACCGCTGAT

>RXN02553-downstream 
TAACAAGTTCTAGGCGCGAATCT 

>RXN02620-upstream

GAAGCCCGTGAAATCGATACCAAATAGGAACTCTGCACAATTACTGGCTACAATCTCTTG 
AGATCAATAGGCCAAAACTTTAAGGAAGTAGAATTACGCT 

>RXN02620 

ATGGCAGGAGCAGTGGGACGCCCCCGGAGATCAGCTCCGCGACGGGCAGGCAAGAATCCT 
CGCGAGGAGATTCTTGACGCCTCTGCTGAGCTTTTCACCCGTCAAGGCTTCGCAACAACC 
TCCACGCATCAAATCGCTGATGCCGTGGGAATCCGCCAAGCCTCGCTGTATTATCACTTC 
CCGTCCAAGACGGAAATCTTCCTCACCCTGCTGAAATCTACTGTCGAGCCGTCCACTGTG 
CTCGCCGAAGACTTAAGCACCCTGGACGCCGGACCTGAGATGCGCCTCTGGGCAATCGTT 
GCCTCCGAAGTGCGTCTGCTGCTGTCCACCAAGTGGAACGTCGGTCGCCTGTACCAACTC 
CCCATCGTTGGTTCTGAAGAGTTCGCCGAGTACCACAGCCAGCGCGAAGCCCTCACCAAC 
GTCTTCCGCGACCTCGCCACCGAAATCGTCGGTGACGACCCCCGCGCAGAACTCCCCTTC 
CACATCACCATGTCGGTGATCGAAATGCGTCGCAACGACGGCAAGATTCCAAGCCCGCTT 
TCCGCAGACAGCCTCCCGGAGACCGCAATTATGCTTGCCGACGCCTCCCTCGCCGTCCTC 
GGCGCGCCGCTGCCCGCCGACCGGGTCGAAAAAACGCTTGAACTAATCAAGCAGGCTGAC 
GCGAAA

>RXN02620-downstream
TAACCATCCGCGCCTGCGAAATC 

>RXN02758-upstream

ATACATCTCACCCAATTCCCCATAACTAGACAATTGCCCAGCAACGACTGATAAGTCTCC 
AATGTCGTGTTCCGCGCTCAGACATGAGACAATTGTTGCC 

>RXN02758 

GTGACTGAACTCATCCAGAATGAATCCCAAGAAATCGCTGAGCTGGAAGCCGGCCAGCAG 
GTTGCATTGCGTGAAGGTTATCTTCCTGCGGTGATCACAGTGAGCGGTAAAGACCGCCCA 
GGTGTGACTGCCGCGTTCTTTAGGGTCTTGTCCGCTAATCAGGTTCAGGTCTTGGACGTT 
GAGCAGTCAATGTTCCGTGGCTTTTTGAACTTGGCGGCGTTTGTGGGTATCGCACCTGAG 
CGTGTCGAGACCGTCACCACAGGCCTGACTGACACCCTCAAGGTGCATGGACAGTCCGTG 
GTGGTGGAGCTGCAGGAAACTGTGCAGTCGTCCCGTCCTCGTTCTTCCCATGTTGTTGTG 
GTGTTGGGTGATCCGGTTGATGCGTTGGATATTTCCCGCATTGGTCAGACCCTGGCGGAT 
TACGATGCCAACATTGACACCATTCGTGGTATTTCGGATTACCCTGTGACCGGCCTGGAG 
CTGAAGGTGACTGTGCCGGATGTCAGCCCTGGTGGTGGTGAAGCGATGCGTAAGGCGCTT 
GCTGCTCTTACCTCTGAGCTGAATGTGGATATTGCGATTGAGCGTTCTGGTTTGCTGCGT 
CGTTCTAAGCGTCTGGTGTGCTTCGATTGTGATTCCACGTTGATCACTGGTGAGGTCATT 
GAGATGCTGGCGGCTCACGCGGGCAAGGAAGCTGAAGTTGCGGCAGTTACTGAGCGTGCG 
ATGCGCGGTGAGCTCGATTTCGAGGAGTCTCTGCGTGAGCGTGTGAAGGCGTTGGCTGGT 
TTGGATGCGTCGGTGATCGATGAGGTCGCTGCCGCTATTGAGCTGACCCCTGGTGCGCGC 
ACCACGATCCGTACGCTGAACCGCATGGGTTACCAGACCGCTGTTGTTTCCGGTGGTTTC 
ATCCAGGTGTTGGAAGGTTTGGCTGAGGAGTTGGAGTTGGATTATGTCCGCGCCAACACT 
TTGGAAATCGTTGATGGCAAGCTGACCGGCAACGTCACCGGAAAGATCGTTGACCGCGCT 
GCGAAGGCTGAGTTCCTCCGTGAGTTCGCTGCGGATTCTGGCCTGAAGATGTACCAGACT 
GTCGCTGTCGGTGATGGCGCTAATGACATCGATATGCTCTCCGCTGCGGGTCTGGGTGTT 
GCTTTCAACGCGAAGCCTGCGCTGAAGGAGATTGCGGATACTTCCGTGAACCACCCATTC 
CTCGACGAGGTTTTGCACATCATGGGCATTTCCCGCGACGAGATCGATCTGGCGGATCAG 
GAAGACGGCACTTTCCACCGCGTTCCATTGACCAATGCC 

>RXN02758-downstream
TAAAGATTCGTTTCTCGACGCCC

>RXN02910-upstream

AACCTAGGCCGATACCCATGTGGAAATCTCGACGTCTTAAATGGACGATTGGAGCTAAAA 
CCACGAACAGCTGGGATTTTCCACGATAGGATTGGGTCTC 

>RXN02910 

GTGGAGATTCGTTGGTTGGAAGGCTTTATCGCGGTCGCGGAAGAATTGCACTTTAGTAAT 
GCTGCGATTCGTTTGGGGATGCCGCAATCGCCGTTGAGTCAGTTGATCCGGCGGTTGGAG 
TCGGAGTTGGGGCAGAAGCTTTTTGATCGCAGTACCCGGTCGGTGGAGTTAACTGCCGCG 
GGTCGGGCGTTTTTGCCACATGCCAGGGGGATTGTGGCGAGCGCTGCGGTGGCGAGGGAA 
GCTGTGAATGCTGCCGAGGGGGAGATCGTTGGTGTTGTTCGCATTGGTTTTTCTGGTGTG 
CTGAACTATTCCACGCTGCCGCTTTTGACCAGTGAGGTGCATAAACGGCTTCCTAATGTG 
GAGTTGGAGCTCGTTGGTCAGAAGTTGACGAGGGAAGCGGTAAGTTTGCTGCGCTTGGGG 
GCGTTGGATATTACGTTGATGGGTTTGCCCATTGAGGATCCAGAGATTGAGACTCGGCTG 
ATTAGTTTGGAAGAGTTTTGCGTGGTGTTGCCGAAGGATCATCGTCTTGCGGGGGAAGGA 
GTGGTGGATTTGGTGGATCTGGCTAT^GATGGGTTTGTGACGACGCCGGAGTTTGCGGGG 
TCTGTGTTTAGGAATTCCACCTTTCAGTTGTGTGCTGAGGCTGGTTTTGTGCCGAGGATC 
AGCCAGCAAGTTAATGATCCTTACATGGCGCTGTTGTTGGCGCGG 

>RXN02910-downstream
TAGTCAATCATGGGGGAGTATCC

>RXN02946-upstream

CACGGATTTACCAAGACCACCACCCGCAACTCAGTTACATTGTTCAAATGTCCTAACACA 
TTTACATGAGCTTGTTGGGTGGGCAACGAAAGGAGACATC 

>RXN02946 

ATGACCACCGAAGCTCCCATTTGGCCAGCCGAACTCTTCGAAGACCTCGACCGCAACGGA 
CCAATCCCCCTCTACTTCCAAGTAGCCCAACGCCTCGAAGACGGCATCCGCAGCGGAGTC
CTCCCACCCGGAGCACGCCTAGAAAACGAGATCTCCGTGGCGAAACACCTCAACGTATCC 
CGCCCCACCGTCCGACGCGCCATCCAAGAAGTCGTAGACAAAGGCCTCTTAGTTCGCCGC 
CGCGGTGTTGGCACCCAGGTCGTCCAAAGCCACGTCACCCGCCCAGTCGAACTGACCAGT 
TTCTTCAACGACCTCAAAAACGCCAACCTGGACCCCAAAACCCGAGTCCTCGAGCACCGC 
TCCTTGCAGCAAGTTCCGCCATCGCAGAAAAACTCGGAGTTTCCGCAGGTGACGAAGTCC 
TCCTCATCCGCCGCCTCCGCTCCACCGGAGACATCCCCG 

>RXN02946-downstream
TAGCGATCCTGGAAAACTACCTC 

>RXN02954-upstream

TTCGCCACACTCCCAATATCTACCAAAAATGGTGATCTATTATACATAATGGAATTACCA
AAGCTTCATATCACTTTTCCACAGCCTGAAAGAACATACT 

>RXN02954 

ATGTCCGCAGCTTTACCTCACACAGCAGCAGATCCCGTACACACCACCCCAGCGAAACCG 
CTGCTCGATCATGTCTTAGATTCACTAGGACGCAGCATCATCAGTGGTGAAATGGAAGCC 
GGTAGCACATTCAAACTGCAAGACATCGGTGAAAAATTCGGTATCTCCCGCACCGTCGCC 
AGAGAAGCCATGCGTGCCTTAGAGCAACTTGGGTTGGTGGCCTCATCGAGACGAATTGGT 
ATTACAGTGCTCTCGCACGAGCACTGGGCTGTCTTTGACAAAGCCATTATTCGCTGGCGC 
CTCGAAGATGAGCGTCAACGTGAACAGCAACTGCAGTCACTCACCGAACTTCGTATTGCC 
ATTGAACCAATTGCTGCACGCAGTGTTGCCCTTCATGCATCGAGCGCAGAGATTGCTATC 
ATCGGTGATCTTGCTGCACGAATGCGTAACCTCGGTGAAGCTGGTCGTGGCGCATCACAA 
GAATTCCTAGACGCAGATGTGAAATTTCATGAGCTTATTTTGCAGTATTGCCATAATGAG 
ATGTTCGCTGCCATGGCACCACCCATAAAAGCTGTACTAGTCGGGCGCACCACACTTGGC 
CTTCAACCCGATCGACCTGCCGAAGAAGTCTTGGACAATCATGATGCTCTCGCACACGCA 
CTAAGTGTTCGTAATGCAGACCTCGCCGAAAAAGCATCCAGGAGCATTCTGAATGAGGTG 
CGCGACGCACTGACCTCG 

>RXN02954-downstream
TAATTGCCACTAAACGAGTCACT

>RXN02990-upstream

GAAGACTAAGCACCAGTTTTAACAAAGCAGGGACAATCCACACACTTAAACCATGATGTG 
GCTTGTTCCTGCTTTTTCGTCAACGAAGGGCAACAACGCG 

>RXN02990 

ATGGATATCCAAGCCGAAAAGATTGAAAAGCTCAGAAAAGCACTCGACAACTTTGAACGC 
GCTCATGCGCGAGGCGAATCAGACTTCTTTGACCATGAAAAAGAAGAAAAGAAAGCCAAC 
GTACGCAGACGTGCCCTGCTGCTGCTTAACCAACGCGCACGATCAGTCAACGAACTAAGC 
ACCAGACTTAAAGCACTGGAGTTTGAGGAAGACATCATCAATGAGGTCATTGGCGATCTC 
ACCAGATCCAAACTGCTTGATGATGAAGTTTTTGCCACTGAGTGGGTTCGGCAACGTGCT 
GCCAGGCGAGGAAAATCTTCGCGTGCGCTGGACCGCGAACTGCAGGAAAAAGGCGTCGAC 
AAGCAAACGCGTGCTGCGGCGCTTGAGCAAATCGACCAGGCCGATGAGCGGGACACGGCG 
CGGGCGGTGGCCGTGAAAAAGGCGCGCTCAGAGACCAAGATTCCGCAGGACCGCGCCGAC 
TACGACAAAGCGCTTCGGCGCGTGGTTGGTGCGCTGGCACGGCGGGGATTTCCGGCTGGA 
ATGTCCATGGACCTTGCGCGGGAAGCGCTAGACGCGCGAATCGAGGATTTGAAAAAC 

>RXN02990-downstream
TAAACCCCGGATGGGAATCATCC 

>RXN03023-upstream

GGTATTACCCGAAAGTAATGTCGTTAATACTGTTTTTAATGGCTATAAAGAGGCATAGGG 
TTAGTTATATGAGTAACCAACCATCGGGATCGTCGCGACC 

>RXN03023 

GTGCCTCTGTATAAACAGATCGCTTCTTTGATTGAGGACTCCATCGTTGACGGAACCTTG 
AGCATTGATCAACGCGTGCCTTCTACTAATGAACTAGCCGCGTTCCATCGCATTAATCCC 
GCCACCGCACGCAACGGCCTGACCCTCCTTGTCGAAGCCGGCATCCTCTATAAGAAGCGT 
GGCATTGGCATGTTCGTCAGCGCCCAGGCCCCAGCACTCATCCGAGAGCGGCGAGATGCC
GCCTTCGCGGCTACTTATGTAGCACCGCTTATCGACGAATCCATCCACCTTGGTTTCACT
CGTGCGCGCATTCACGCCCTTTTAGACCAGGTCGCTGAAAGTAGGGGCCTGTACAAG 

>RXN03023-downstream 
TAGCGCTTAAACCCTCTTGACCT 

>RXN03071 

ATGCTGGCGCCATGGCAGCTCCACAAAGACGACGACATCGTCGCCCGCAACGAGCAGATC 
ACCGAAGCCTTCGAGCGCGACGTCGTCCCATACGCGGAGCTTTTCGACGCCTCCGGCCAG 
ATTCCTTCATCGCAGGAGTTCTTCCGCGTGTCACTCACCGGACAGTATCTTCCAGACAGT 
GAGGTTTTGCTGCGCCTTCGCCCCGTCGACTCCGGCCCAGCATTCCAATCGTTAACCCCC 
TTCGAACTTGAAAACGGACAGATTGTCCTCGTCAACCGTGGTTACGAATCATCAGAGGGC 
ACAATCGTCCCAGAGATCGAGCCTGCTCCTTCACACCAG 

>RXN03071-downstream 
TAACCATCACCGGATTCGCCGCA 

>RXN03072-upstream

ATCAGAGGGCACAATCGTCCCAGAGATCGAGCCTGCTCCTTCACACCAGTAACCATCACC 
GGATTCGCCGCAAGAACGAGGGCCTCCAGGTTCTGCACCT 

>RXN03072 

ATGGAAGACAGCGGCTACACCCAGGTCTACGGAATTAACACCGAACAGATCAGTGACGTC 
ACCGGCCTTGATCTTGGCACCGACTACGTCCAGGTCGCAGAAGGCGAACCTGGTGTTTTG 
AACCCAATGCCACTGCCTCAAATGGACCGCGGTAACCACCTCTCATACGGCTTCCAGTGG 
ATCGCCTTCGGCATCATGGCACCTTTAGGGCTTGGATACTTCATCTGGGCTGAAATGCGC 
GAACGACGCCGCGACAAAGCAGAACGCGAACAGATGGCCGAGCTAAACACTCTTGAACCA 
GTGGTGGAAACCCCTGAAGTTGTTGAAACTGCAGAACCAACCATCACCCCGGCTGCATCC 
AAACGACGTTCACGCTACGGCGATCAACACCGCAATCACTACGAGAAGATCTCCAAACGA 
GACCAAGAGCGCTTC 

>RXN03072-downstream
TAAGCCCGTCTCATTTTTGCACC 

>RXN03090-upstream

TCAATTATCAGTACTATGACGATGACTGGGACGACGACGATGATGACTTCGACGACGACT 
GGGACGACGACTAACTAACCCCTGAGGCACTTTCTATTTC 

>RXN03090 

ATGGCTAAATCAACTCCTTTGATTGCATCGCTACGCTGGCGAATTGTCCTGTGGATGACA 
GCGGTTGTTTTCTTGACCCTAGCCAGCGTTGTGATCATTACCCGTTCGGTGCTGCTTTCA 
GAGGTAACCAACACCGCGAACTCGGCAGTTGAGCAGGAAATTGAGGAGTTTCGTCGCTTT 
GCAGCCGAAGGAATTGATCCAACAACTGCGCAGCCTTTTGAGTCAGGTCATCGCCTGATG 
GAGGTTTACCTGTCGAGGCAGATTCCGGATGAAAATGAAGCCATTGTCGGCATTTTCCCC 
GGAGAGCTCATTCAGGTTGATTACTCCCAGCTCAGTGGCGCCCATCCGCTTCCTTTGGAA 
CACTCCGATCCGTTGATTTCGGAAATCCGACAGACCACGCTGAATTCTGGAGTTTTCAGC 
GATCTTGAACGCGGAACCACTCACTGGGGAAAGGTGAATTTCCAAACTGCTTCCGGTGAG 
GCCGATGGTGAGTTCGTTGTCGCATTCTTCGCTGATAATCTTAAAGACCAGGTCAACGGC 
CAGATCCAGATTCTTATTTTGATCGGCACAGGGGGTTTGATTGCCTCAATTCTGATTGCT 
TGGTTGATTGCGGGCCAGATCATTGCCCCGATCCGCAAATTGAGTTCCGTGTCCGCAAAG 
ATCAGTAATTCGGATCTCACCTGGCGCGTCCCTGTGGAGGGTCGTGATGAGATTGCGCAG 
CTGGCCAGGACTTTTAATGCCATGTTGGATCGCATCGAAATCGCGTATAACGATCAGCGC 
CAGTTCGTTGATGATGCCGGCCACGAGCTGCGCACCCCGATCACAGTGGTGCGTGGCCAG 
TTAGAGCTTCTCGCCACCACCCCGCCGGAGGAACAAGCGCGGTCGATTGAGCTGGCCACC 
ACTGAGTTGGATCGAATGTCGCGAATGGTCAATGATCTGCTCACCCTCGCAGTCGCCGAT 
TCTGGCACCTTCATCCACGCCCACCCCACGGATGTCACGGATTTAACAATCGATATCGAA 
GACAAAGCCCGCACCATCAGCGACCGAATTTTGCTTGTCGACGCCCGCCCGAGGGCCTCG 
TCAGCCTCGACGAGCAGCGGGTCACCGAGGCAGTGCTTGGAGTTGTTCGGCAATGCGTTG 
CGCTACAGCGATGATGTGGTGGAGTTGGGTTCAGGATTTCAAGGGGTCTGGCCCCCACCG 
CATTTTTCGCATTTGGGTTCG

>RXN03090-downstream
TGACAAAGGAAACGGTGTTGATA 

>RXN03100 

CTCTACGGCCAGGACAAAGTGACCTCCGATCCGATGGAAGCTGCTTACACTAGCCTCTAC 
CTCTGGAAAGAAATGGTAGAGAAGGCCGATTCCTTTGATGTCGCCGCAATTCAAGCAGCC 
GCCGACGGAACCACTTTTGATGCACCAGAAGGAACCGTGGTGGTTGGCGGCGATAACCAC 
CACATCTCCAAAACACCGCGCATCGGTCGAATCCGCCCGGATGGATTGATCGACACCATT 
TGGGAAACCGATTCCCCAGTTGATCCGGACCCATACTTGTCTTCCTATGACTGGGCCAAG 
ACCACCGCTGCGACTTCC 

>RXN03100-downstream 
TAAGAGATAAAAATCATGGACAT 

>RXN03127-upstream

AGGTAGAACATCTGAAAGCACTTCAGATAAGACGGATGGGGTCTGATGGAAACGACAGTC 
GATATGATCAGAACCATCTCCAGATTAGGAAGTGAACACA 

>RXN03127 

ATGGAAAGCTCCAAAAAGACTTCGCGATCAAGGTCCACTACTCAAGAAGCAGTGCGCGAC 
ATTAAAAAATACATTCGGGACAACCGGCTGCGTACGGGAGACCTTCTTCCTTCCGAAGCG 
TTCTTATGTGAGGAATTGGGTTGTTCCCGTTCTGCGATCAGGGAGGCGATCCGCGCGCTC 
GTGACCTTGGACATCGTCGAGGTTCGCCACGGCTACGGCACTTTCGTGTCCAGGATGTCC 
CTCGAGCCCCTGATCAACGGGATGGTGTTCCGCACGGTGTTGGACAATGACACCTCGGTG 
GAAAACCTTTTCTACGTGGTGGATACCCGCGAAATCCTTGACCTTTCACTTGGCGAAGAG 
CTGATCGAGGTGTTCACCGACGATGACCGCGAGCTACTCCTTGATCTGGTGGACAAGATG 
CGCGAGCACAACGATCAGGGCGAATCCTTTGTGGTGGAGGATCAAAAATTCCACCGAGCA 
CTCCTAGCGCGAACGAAAAACCCGCTGATTAGAGAGCTCAACGATGCGTTTTGGCAGATC 
CAAACCGAGGCGCAGCCCATGCTCAATCTGGCTATGCCCGCAGACATCGACGAAACCATC 
AAAGCTCACAGCGACATCGTCGAAGCGCTCTCCAGCGGCAACATCGACGATTATCGCAGC 
GCCGTGCTCGCTCACTACGCGCCGTTTCGCCGCATGATTTCCAACATGCTCGATGCGCAC 

>RXN03127-downstream
TAGCCTCATTGCGCGCGGGTTGT 

>RXN03136-upstream 

CTTATCGACGAGCTCCAAGGCTGGACCGTGGTACGTGCCACTTCCCTGTCGTGGCTGAAA 
TAAAAATCCCCGAAACCTCCTTGGACACATCGCCCACAAA 

>RXN03136

TTGGGTGCGCACTCCGCCAACTCCATCCGTGGTGTGATCGACCGTCTCGATGCCTCCACC 
GTGGTGATCGTTGCCGATGTCCACTGGGCCGACGTGGAATCCATGCAAAAACTCATCGAA 
TATTCCATGCGCATGGTTTCTGGCCGTTTCGCACTCATCATGATTGGCCTTGATGAAGAG 
AACTTAGTGTTCCACGATGAGGTGGTCTCGCTCCCCTCCATCGCAGACTCCACCTACGTA 
TTGCCGCCGATGAGTATTGAAGAAATCCGCCAGCTTGCGCTTACCGATGTCCGCGGCCGC 
ATCAGCACCACCACCGCCACAGACATCCAGCGCATCACCGGCGGCATCTACGGGCGAGTC 
AAAGAAGTCCTCCACTCGGAATCCCCCGATCACTGGCGAATGCCCAACCCAAATATTCCC 
ATCCCACAAAGCTGGCATGCCAACCTGTTGAGACGCATCACCAACGAAGAAGTCTGGCAT 
GTACTACTCGCCGTCGCTGTCCTTCCCTCCGGAGGCCCCATTGACCTGGTAAAACTCATA 
GGCAACGACCCCACGGGCATGCTTTGCGACGACGCCGTCCGCTCAGGCCTGCTCCGCGTG 
CTGCCGTCTGACGGCCAACCACAAGTGGATTTGGTCCTGCCGATCGACCGCGCCGTACTG 
CAATCACGCACTCCGCTCAACATTCTGGCGCAGTTGCACCACAAGGCAGCCGAATATTAC 
GGCAAGTGGAATCAAAAAGATGCCCAACTGGAGCACGAAGCATTTGCTGCAATTGATCCA 
AATGATCCAGCAGTGCGAGCCCTAGCGCAGCGCGGATATGCGTTGGGTAGGACTGGCCAC 
TGGATGGAATCGGCACACGCCCTATCTCTTGCCGCGAACCGCACTGCACACCAAGAAGAA 
TCAAATAAGTACTTGCTGGAGTCCATCGATTCACTGATCGCCGCCGCCGATCTCCCCCAA 
GCTCGATCCAGAGCATCCACCCTTGATCTTGGAGAAACCGGCATTCAACAAGACTCAATG 
CTGGGCTACCTGGCAATCCACGAAGGCCGGCGCCTCGAAGCACGCAATCTCCTTCATCGT 
GCTTCTGAAGAATTGCTGGCGCAGCACCCGATTGATCCGATCCACGGCCCCCGCATGGCT 
CAGCGCAAAGTACTGTTAAACTTAGTGGACTGGAATCCAGAAGAACTCCTGGTGTGGGCT 
GATAGAGCAGTCGCATGGACTGAAGAGGATGCTGGCGAAAAGGTTGAGGCCCAAGCTATT
TCCCTCATTGGACAATCCATCCTCGATGGCTGCCTCCCCGAAGATAAACCCATCCCCGGT
GAAACCACCCTTCACGCACAACGCCGCCACATGGCAATGGGCTGGCTTTCCATGGTTCAC 
GATGATCCAGTAACTGCACGTCAAAAGCTTGAACGTCGCACATCCATCAATGGTTCAGAA 
CGCATCAGTTTGTGGCAAGACGGATGGCTGGCTCGGTCCCTACTGCTGCTCGGCGAATGG 
GAGTCCGCAGCACGCACCGTAGAAATCGGTCTGGCCCGCGCCGAACAGTTTGGCATCCGC 
TTCCTCGAACCACTGTTACTGTGGTCGGGCGCCACAATTGCAACAGCCCGCGGAAACTCT 
GACTTGGCACGAAATTACATGAGCAGACTGTCCACCGATCAAGACTCCTTCATCGTCCAA 
TCTATGCCATCTGCGATGTGTCGCATGTGGGTCCACCGCCATAGAAATGAAATCCCCGGT 
GCGATCGTGGCCGGAGAACAATTGGAAAAAATCGCCGCACACAAACACGTCAACGCACCT 
GGATTCTGGCCATGGCAAGACGTCCACGCAACGCATCTCATCCGCATCGGCGAAACTGAG 
CGCGCCCAGGAGTTAGTGAACTCCACGCTTGAGGAGCTCAGAGGCTCCGATATCATGTCT 
GCCCACGCAAAAATTGCCGTTCCCGACGCCATGTTGATGATCCACCACGGAGATGTGAAA 
AAGGGATTTAAGCGTTTCGACGACGCCCTCGATATGATCGATCCCCTCACCCTCCCCTAC 
TATCGGGCACGCATCTGCTTTGAATACGGCCAGGCCCTGAGACGCCAGGGGCAACGTCGA 
CGTGCTGATGAACAATTTGCCCGTGCAGCTTCCCTATTCCAAGACATGGGCGCCGACGCG 
ATGGTCACCCTAGCCAACCGAGAACGCCGGGTGGGTGGCCTTGGTCAACGATCCGAGCAA 
GCCGGTGGGCTCACCCCTCAGGAATATGAAATTGCCCGATTAGTGTCATCTGGGCATGCC 
AACCGAGAGGTCGCACAGGAGCTTTTCCTCTCGCCTAAGACCGTGGAATACCATCTCACC 
CGGGTGTACAAAAAGCTCGGAATACGCAATCGGATGGAACTTGCCGAGGCTTTGAAGAAG 
TACTCACACGACGCC 

>RXN03136-downstream
TAGCAGCGGATATGTTTGCGGAC 

>RXN03143-upstream

TTCGTTGCGCGGGGGCATTTTTTAATCGGGCTCTGGTCGATTCTTGCAGTACGACCAAAG 
TCGGATTCGCGTTCATACTTAGTTGATCTATCGTGGTGGC 

>RXN03143 

GTGAAAACTAGCCAAGCGACCATCGCCCGAATTGAGAGAGTTCTCATTTGGGGATTGCAT 
TTACTCATTGCCGTTTTGTTGGTGTTGGTGTGTTGGCGTGCCAGCCATTGGGGTGTGTGG 
GTGCTCGCTTTTGGCTATGGCGTGGTTTATGTGGCGGGTGTGGTCCCGAATTCGCCGTTT 
AAGAATCACCCTATGGCGTGGTTTCTTGTGCTGAGTTTGTTGTGGGCGAGCCTGATTTGG 
GATGGACCGGAGCCTGCGTATTTGGTGTTTCCGATGTTTTTCCTCGCAGTGTTGATCACG 
ACACCGCTGAAATCCGCGATCATCATTGCAATACTGACGGCGATCGCGGTGGTTACGTTG 
GCTATGCACCTGGGGTTTTCTGTTGGCGTTGTCACCGGTCCGATCCTTGGCGCGTTGGTG 
GCGTGGGTAATGGGTACGTGTTTTCAGTTATTGGCACAAGCCTTAAAGGAGCTTGTCGAC 
GCACGTGCGTCGGCGATCCGGGCGTCGAAAAGCGCTGGCGAGCAGGCAGAACGAGCCCGC 
ATAGCGGGCGAAATACATGACACTGTGGCGCAGGGGTTGTCCTCGATTCAGATGTTGTTG 
CATGCGGCGGAAAAACGGGTGGATGATCCGCAGGCGTTAAGCCATATACGGTTGGCCAGG 
CAAACGACAGCTGATAATTTGGCGGAGACCAGGCAGATCATTGCTGCGCTGCAACCGACT 
CCACTCATTGGGGCGGATCTGCCGGTGGCGTTGGCCAGACTGTCGTCGACCACCCCGATG 
GGACAGAACATCACGTTTGAAGTCGACGGATCCCCACGGGTATTACCTGATGCGATGGAG 
GCAGAGATCGTACGAATTGCCCAAACGCTGCTGGGAAATGTGGTGCGGCATGCACAGGCA 
GATTCTGCAAAAATGACCCTGACATATCAAGATGATCAAATACTTCTAGATGTCATCGAT 
AATGGGCAGGGATTTGATGTGGCAGAAGTGATCCGTAAAAAATCCATTGGACTGCCCACA 
GCGCAACGCCGGGCTGAAGGGCTGGGCGGAACAATAATTATTGAATCTACAATCGGATCG 
GGAACTGGAATTTCCGCCCGTTTTCCCTATCCACAAAAGGACCAAGATAAG 

>RXN03143-downstream
TGATCCGTATTCTGTTGGCTGAT 

>RXN03155 

GGATACCCACCACCGCCCACCGCCTCAAAAGACGCTGCGGGTGGGTTGCCACAACTGATC 
AGAGAGCTTCTCGACGCGACCCCCATCGATCATTGGTCCAACGATCGGCCTACTCTCACG 
CTGCCAGAGCATTGGGTGACAGACATCGACATTAAGAACCCTGTGCTTCGGGAAGTCGCC 
TCCCATCCCTTCTTCGATGGCTGCCCGATCGGAGATTTAGATGCCGATGCCTTTGTGGAG 
GATGGCACCCTCATTCACGAAAACGGGACTTTAAGATTCCGCAGCCCTGAGGAACGCACC 
TTGGTTCGGGCTTCTACTCCCCCATCGATGGCAAGAAGCCCGCGGGAGTGGGAATCGACG 
GAGGGAGGCGTCGATAAGCTAATTGCCGCAGGAAACCTGCCCCTGGCCCGACTGCATGTA 
GAGGAACTACCCCGTGCCGATGAGCAGCGCGCATTTTTGGCGCTGTACGGCGGGCAGTCG
TTTGAGGCGGCCTCGGCGTCGCCGTTTTATGCGCTGGCCACCTGGAATCCGGAGGCGTTG
CGGGGCGATCCGACCTTCGATATGTTCGCCGATGCGCTAGACACTGGGCATTACAGGGAA 
GTCCCGCGTCCGGATGCCCCTGAAGAAAGCCAGATCCACGATTTCATCAGTGGCTGGCTG 
GCGTTGGTTTACGATGATCCCCTCACCGCCCGCCGTCTGCTCTCCAGTAGGGGCCCCTCC 
GATTTGGTGGGACTGTGGCAGTCGGCGTTTTTGGCGCGAGCGCACTACGTGCTGGGAGAA 
TTCCAAGAAGCCTCCGCCGTTGTCGAACGCGGCCTAGCCACCGGCGACCGCACCGGAGCC 
TCCCTACTCGAACCCGTGCACCTGTGGACCGGCGCCCT^GTCGCAGCCATGACTGGGCGC 
ACCGAATTGGCCAACCACTATTTACAGCGCCTGACCGTGCCCGACGATGCGTTCCTCATC 
CAAAAACTCAGCGCATCCATGGGCAAATTGATCACCGCATCCATGACCTCAGACACCCGC 
GCAGCAACCTTGGCCGGCGACCGCATGGCGTCGGTCGTATACACCACCAATACCCAGCAG 
CCCGGATTTTGGGCCTGGGAAGACATGTATGCGATCTCATTGATCCGAACGGGACGCATC 
GACGCCGCAGCCGCCGTCATGGATGGCATCCCTGACTCCACCATCCCCTCGCTGCGTGCC 
CGAAATTTGGTGCCCCAAGCAAACATCGAAATCCAACGAGGCTCCACAGCACGAGGCGTA 
AAAATGCTCTCCGAAGCCGTCGACCTCATTTCCTCCGTCAACATGCCAGCATATGAAGCC 
CGCATCCTCTTCGAATACGGGCTGGTTCTACGACGCATGGGCAGGCGCAGCCAAGCAGCC 
GAAATGTTCACCCACGCCGAAGAAGTCTTCACCGCCATGGGTGCGGTCACTCTGGCTGCC 
CGCTGCCACGGCGAACGACGAGTCGCAGGCGTTGGGCCACGCAGATCAGCGCAGGGACTC 
ACCCCTCAAGAGGAACAAATCACTGCGCTGGTTGTCGACGGCTGCTCCAACCAAGAAGTC 
GCCCGTGAGCTTTCCCTCTCCGCCAAAACGGTGGAATATCACCTCACGAGGGTGTACAAA 
AAGCTCGGGGTGAGCTCCCGTGGAGAGCTTCGAGAATTACTGAAGGTC 

>RXN03155-downstream 
TGACACAGCGTTGTTCAGCAGCT 

>RXN03181-upstream 

GTCGCAAAAGTCGCTGGCGTCTCCCCTTCCACTGTGTCGCGGGCGTTTTCGCAGCCTGGG 
CGAGTGAGTTTTTCCACTGCGGAGAAAATCCGCAACGCGT 

>RXN03181 

GTGGAAACTCAGGCTTTTCAGCGCCAAAACACCGGCCTCATCGCTATGGTTGCCGCCGAT 
GCGTCGAATCCCTTCTTCTTGGAAATTTTCCGGGGCGCGCAGCACGCCGCAAGCACTCAG 
GGCTATACGGTTGCGCTTGTCGACGCCCGGGAGTCGGCGATTAAGTCCAGGGAGGTGCTG 
GACAAGATCGTCCCCCACGCCGATGGCTTATTGCTCGCTGCTTCAAGGATGGATTCTGGT 
GAGATCCACAAAGTCGCGCGGGAAATTCCCACTGTATTAATGAGCCGTGAAGTGCAAGGT 
ATTCCCAGCGTGATGGTGGATAACTACGACGGTGCGCCGAAGGCTGTGGTGCATTTGGTG 
GATCAGGGGTGCCGCTCCATTACCTATATCGCCGGTCCTAATAAATCCTGGGCT 

>RXS00070-upstream

CCACTCGTCCTCGACATACTTCTCCTGGCACTAAACGCAGGGGTTGACACATCTGGGTAGACTATCGAA 
GTAGATTTTGTGTCATTGAGGAGGATCAACG

>RXS00070 

GTGGGTATCAATCGCATCAGCCAAGGCTCTGCCCCGAAGCTGGGAGTGCGAAGCACCAGACAGCGAAAA 
GCCGTAATTGACGTTCTTGAGGAAATCGATAACTTCGCTTCCGCCAAAGAAATCCATCACGAGCTATCC 
ACCAGGGAACACAACGTCGGCCTCACAACCGTCTACCGAACCCTCCAATCCCTCGCCGACATCGGAGCA 
GTCGACGTACTTACCGTCACGGGTGGAGAAACTCTGTACCGCCAATGCCACGACGAGGGACACCACCAT 
CACCTGGTCTGCACCAATTGCGGTCGCACAGTCGAAATCGATGGCGGTCCAGTAGAGACATGGGCACAG 
GAAATTGCCACTAAAAACGGCTTTGCTCTCAGTAGTCACGAGGCTGAAATCTTTGGACTTTGCGCTGAT 
TGTAAGGAAAAAGTTACG 

>RXS00070-downstream
TAGTTCAAGGAGATATGAAGCTG

>RXS00133-upstream 

GTTACATCAGATGAGGATGCCCTATGGGTGTACACATGCGACGGGTGTATTGCAGGAGGAAATTTGAAG 
GTGGATACCCAGCGGATTAAAGATGATGAAG 

>RXS00133 

ATGCTATTCGTTCGGCGGCTGACATCGCTGAAAACCGCAACAGGCATCCCAGTCACCATGTTCGCCACT
GTGTTGCAGGACAATCGCCTGCAAATTACTCAGTGGGTTGGGTTGCGTACCCCGGCTCTGCAGAATCTG
GTCATTGAACCAGGTGTGGGCGTTGGTGGACGCGTCGTCGCAACCCGTCGTCCGGTTGGTGTGAGTGAT 
TACACCAGGGCAAATGTCATTTCACATGAGAAGGATTCCGCGATTCAGGATGAGGGCCTTCATTCCATT 
GTCGCAGTTCCCGTGATCGTGCACCGCGAAATTCGTGGCGTTTTGTATGTTGGCGTTCACTCTGCGGTG 
CGTCTCGGCGACACTGTTATTGAAGAAGTCACCATGACTGCGCGCACGTTGGAACAAAACCTGGCGATC 
AACTCCGCGCTTCGCCGCAATGGCGTTCCTGATGGTCGCGGTTCCCTCAAAGCTAACCGCGTGATGAAT 
GGGGCGGAGTGGGAGCAGGTTCGTTCCACTCATTCCAAGCTGCGCATGCTGGCAAATCGTGTGACCGAT 
GAGGATCTGCGCCGCGATTTGGAAGAGCTTTGCGATCAGATGGTCACCCCAGTCCGCATCAAGCAGACC 
ACCAAGCTGTCCGCGCGTGAGTTGGACGTGCTGGCTTGTGTCGCGCTCGGTCACACCAACGTCGAAGCT 
GCTGAAGAGATGGGCATCGGCGCGGAAACCGTCAAGAGCTACCTGCGCTCGGTCATGCGCAAGCTCGGC 
GCCCACACGCGCTACGAGGCAGTCAACGCAGCACGCCGGATCGGCGCACTGCCT 

>RXS00133-downstream 
TAAAAAGATTTTGCTTTACGACG 

>RXS00144-upstream

ATAGTGTCAGACAACAACCAGGAAACTGGTCGTTGCAGAGTTTTTGCAAAATTGGACATCCTTTAACGG 
ACCGCACAGAGAGGCGGGGAAGGAGGTCACG 

>RXS00144 

ATGAGCGAACGTAATAGTGCTGTACTAGAACTCCTTAATGAGGACGACGTCAGCCGTACCATCGCACGC 
ATCGCGCACCAGATTATTGAGAAAACCGCGCTTGATTCCAAAGACGCGGATCGGGTCATGTTGTTAGGT 
ATTCCCTCAGGTGGAGTCCCATTGGCCCGTAGGCTCGCTGAAAAGATCGAAGAATTTTCCGGCGTTTCG 
GTAGATACCGGCGCTGTTGATATCACCTTGTACAGGGATGATCTTCGAAACAAACCACACCGCGCACTG 
CAGCCCACCTCTATTCCAGCAGGTGGTATCGATAACACCACCGTGATTTTGGTGGATGATGTGCTGTTT 
TCCGGTCGTACCATCCGCGCTGCACTTGATGCTTTGCGCGACGTTGGACGCCCCAACTACATCCAGTTA 
GCTGTGTTGGTTGACCGCGGTCATCGCCAGCTGCCCATTCGCGCTGACTATGTGGGCAAAAATCTCCCC 
ACCGCACGCGCGGAAGACGTTTCCGTCATGCTTACAGAAATCGACGGCCGCGATGCAGTCACGCTCACC 
CGAGAAGACTCTGAAGGGGATTCC 

>RXS00144-downstream
TAGATGAAGCACCTCCTATCCAT

>RXS00205-upstream

TGGGGGAGTGGGGATAAGTTCATCTTAAACACAATGCAATCGATTGCATTTACGTTCCTTATCCCACAA
TAGGGGTACCTTCCAGAAAGTTGGTGAGGAG 

>RXS00205 

ATGGCTTCCGAAACCTCCAGCCCGAAGAAGCGGGCCACCACGCTCAAAGACATCGCGCAAGCAACACAG 
CTTTCAGTCAGCACGGTGTCCCGGGCATTGGCCAACAACGCGAGCATTCCGGAATCCACACGCATCCGA 
GTGGTTGAAGCCGCTCAAAAGCTGAACTACCGTCCCAATGCCCAAGCTCGTGCATTGCGGAAGTCGAGG 
ACAGACACCATCGGTGTCATCATTCCAAACATTGAGAACCCATATTTCTCCTCACTAGCAGCATCGATT 
CAAAAAGCTGCTCGTGAAGCTGGGGTGTCCACCATTTTGTCCAACTCTGAAGAAAACCCAGAGCTGCTT 
GGTCAGACTTTGGCGATCATGGATGACCAACGCCTCGATGGAATCATCGTGGTGCCACACATTCAGTCA 
GAGGAACAAGTCACTGACTTGGTTAACAGGGGAGTGCCAGTAGTGCTGGCAGACCGTAGTTTTGTTAAC 
TCGTCTATTCCTTCGGTTACCTCAGATCCAGTTCCGGGCATGACTGAAGCTGTGGACTTACTCCTGGCA 
GCTGACGTGCAATTGGGCTACCTTGCCGGCCCGCAGGATACTTCCACTGGTCAGCTGCGTCTTAACACT 
TTTGAAAGACTATGCGTGGACCGCGGCATCGTCGGAGCATCTGTCTATTACGGTGGCTACCGCCAAGAA 
TCTGGATATGACGGCATCAAGGTGCTGATCAAGCAGGGAGCCAATGCGATTATCGCTGGTGACTCCATG
ATGACCATCGGTGCGTTGTTGGCTCTTCATGAGATGAATTTGAAGATCGGTGAGGATGTGCAGCTCATT 
GGGTTTGATAACAACCCAATTTTCCGGCTGCAGAATCCACCGCTGAGCATCATTGACCAGCACGTACAA 
GAGATCGGTAAGCGTGCGTTTGAGATTCTGCAGAAGCTGATCAATGGGGACACCGCGCAAAAATCTGTG 
GTGATTCCAACGCAGCTCAGCATCAATGGATCAACGGCGGTTTCCCAAAAGGCGGCCGCAAAGGCAGCA 
AAAGCAGCCCAAAAAGCAGCCGCGAAAGCCGCACAGAACACGCAACACGAGGTGAGCCTAGATGGTGAA 
CTC 

>RXS00205-downstream
TGAACAAGCGCTTCATCAGCATG 

>RXS00470-upstream

TCATAACCAGGTTGGGCAAAAGGGATGAATCCCTGGTTGTGGTGGGGCTCCTGAAAAGTACTCATAGAC
TCTATTGTGGAGTGTTGAGGCTGATAAGTGA 

>RXS00470 

ATGGGGGAAAGCCCTGAAAAGGTGGCGTTCAGGGTCTTCCCTGATGGTTTGGTGTCGCAGGGGCATGAC 
ATGATCGAAGATATGAGTAACACACCTGCGCCTTATACCCCGCAGCCTGCGGGGCAAGCGGTGCCTTTA 
TATCCCACGTTTACCCGGTCAAGAGATGGTCGGGTTGTTGCGGGTGTCGCATCGGGGCTGGCAAAGCAT 
CTTAATGTGTCGGTGTTTTGGGTTCGTGCGCTGCTGATTTTTGCGGCGTTGCTGAGCGGTGCGGGTCTT 
TTTGCGTATGCCTTGATTTGGATTTTTACGCGCATTGAGAAAAAGGGGAGTGGGGAGGCGTCGACAAGC 
AAGCGCTGGGTGTCGTGGTGCCTGGTGCTGCTCGCTATCGGTGGTGCTGCGGCGTCGGTGATGCTGAGC 
ACCGGCTTCGCGGTGGGCACGTTGGTGCCCATCGGCGTGGTCGGTGTGGGCCTGTTGATGGTGTGGCTG 
GCGTATGACCGCGGGGTGGAATCCGGCCCGAATCTGCTGATTATTGCCACCGGCGGTGTGTTGATGCTG 
GTGGCGATCGTGCTGATCGTGATGAATTGGAACACCCAGGACGGCTTCGTCATGGCGCTGGTGGCCGTG 
GTGCTCACGCTGGTGGGTGTGGCTGCGCTGGGCGTTCCGCTGTGGGTGCGGATGTGGGATCAGCTGGGC 
GAGGAGCGCGCGGAAAAAGCCGCAGCTGCTGAGCGCGCAGATATTGCTTCCCGCCTGCATGATTCGGTA 
CTGCAGACCTTGGCGCTGATTCAAAAGCGTGCCGACGACCCCGCCGAAGTCGCCCGCCTGGCCCGCGGG 
CAGGAACGCGAGCTGCGTCAATGGCTGTTTGATTCCCAAGATAAAACACCTCAAACAACCGGCACTGTC 
TTTACTGCGTTGGAGCGCGCCTGCGGTGAAGTCGAGGATATTTACGCTCTGCGTATCGTGCCTGTGACC 
GTGGGAACCGATGAAGCGCTGACTGAGAAAACGCAGGCAGCGGTGATGGCAGTCCGCGAAGCACTCGTG 
AACGTGGCCAAGCATGCCGGCGTGGAAACCGCCGATGTGTACGCCGAAATTATGCTCGGCGAACTGAAC 
ATTTTCGTCCGCGACCGCGGTGCAGGATTCGACCCCGACAACATCCCCGACGGGCACCACGGGCTCGCC 
GAATCCGTCCAAGGCCGCGTCGAACGAGCCGGCGGAAAAGTACGCATCAAATCTGAAATCGGCGAAGGC 
ACCGAAGTGGCAATCACCATGGATGTG 

>RXS00470-downstream
TAGTTGGTCGTACGCGCGTGTCT 

>RXS00471-upstream

ACGCATCAAATCTGAAATCGGCGAAGGCACCGAAGTGGCAATCACCATGGATGTGTAGTTGGTCGTACG 
CGCGTGTCTTCGGGGCTGTAACCTGAAAGGC 

>RXS00471 

ATGGTTGATGTGTTTTTGGTCGATGACCACTCCGTGTTTCGCTCCGGCGTCAAAGCAGAACTAGGCAAC 
GCCGTCACAGTAGTCGGCGAAGCAGGGACGGTGGCCGACGCCGTAGCCGGCATCAAGGCAAGCAAACCA 
GAGGTAGTGCTTCTCGACGTCCACATGCCCGACGGCGGCGGCCTCGCAGTGCTCCAGCAGATCAACGAC 
TCCGATGTGGACACCATTTTCTTGGCACTCAGTGTCTCTGATGCTGCGGAAGATGTCATCGCCATCATC 
CGTGGCGGTGCCAGGGGATACGTGACCAAATCAATCTCCGGTGAAGAACTCATCGAAGCCATCAACCGC 
GTGAAATCCGGCGACGCATTCTTCTCACCACGCCTGGCAGGCTTTGTCCTCGACGCCTTCGCCGCCCCC 
GATTCCGCAGCTGGCGCAGGCATTGTCGACGCACCCGAAAAAGACGCCGCCGTAGAATCCGGAAAAATC 
CTCGACGACCCAGTTGTCGACGCCCTCACCCGCCGCGAACTCGAAGTCCTCCGCCTACTAGCCCGCGGC 
TACACCTACAAAGAAATCGGCAAAGAACTGTTCATTTCCGTCAAAACCGTGGAAACCCACGCCTCAAAC 
ATTCTGCGGAAAACCCAACAATCCAACCGCCACGCGTTGACCCGGTGGGCTCACTCGAGGGATCTTGAC 

>RXS00471-downstream
TAATGGCGGCTAAAAAGAGTGGC 

>RXS00481-upstream

GTCCATAAAAATAATGTGCCTACAAGAAATTTATAGTATCCCATGAGTTAATATTTTTAAAAATAAACT 
TTATCTGACTTTGTAGAAAAAGGTGATTACT 

>RXS00481 

ATGCTGAATATGCAGGAACCAGATAAAATCCATCCGGCAGAACCTACACTTCGTAATATTTATGACGTT 
AAAACTAGTGATCCCAAAAGTGAATTAGTTGATCGTTCTGGCATGTCGGAAGAAGACATTGCGCAAATT 
GGGCGGCTAATGAAATCGTTGGCCAGTCTTCGCGATGTGGAACGTAGTATTGGTGAAGCCTCGGCACGT 
TATATGGAGCTAAGTGCCCCTGATATGCGAGCTTTGCACTATTTGATTGTGGCGGGCAATGCGGGCGAA 
GTGGTGACTCCAGGAATGCTTGGAGCTCACCTTAAGCTTTCCCCGGCATCTGTAACAAAGACGCTTAAT
AGGCTAGAAAAAGGTGGGCATATTGTTCGTAATGTGCACCCCGTCGACCGCAGGGCTTTCGCCCTCATG 
GTCACTGATGCCACTCGTGGAGAGGCGATGCGGACGCTTGGTAAGCATCAGGCGCGTCGTTTTGATGCT 
GCTAAACGATTAACTCCACAAGAGCGTGAAGTGGTTATCCGATTCCTTCAGGATATGGCACAGGAGTTA 
TCCCTTAATAATGCACCATGGCTCAACACGGAG

>RXS00481-downstream
TAGATGAGCATCTACGTTAATTA

>RXS00649-upstream

GTATTTGATCTGTGGTGTGGCTGATTCGGGAGGACTCGATGACATTATGTGTATGGTACACATTTTGTG 
CAAGATGCAATAGCTGGCAAACTGGAGAGCC 

>RXS00649

ATGAGCACCGACCCCATCGCGGCCTTGGAATACGAATCCACCATCTTCGCCCGTCACCGGAATCAATAC 
ACCGGCCAAGCAGGTACGAATGCTGGCGTCCTCGATTCCAGCGGCTACAACCTACTCACGCTGCTCCAG 
TTACGTGGCCCCTCCACCATCGGCGAACTCAGCGCCATCACCGGCCTAGACGCATCTACCCTTAACCGT 
CAGACAAAAGCCCTACTAACCAAAGGATTTGTCGAACGCATCCCAGATCCCGACGGTGGAATCGCTCGG 
AAATTCCACCCCACCGACCTCGGCAATGAACTGCTCAACGAGGAACGCACATCCAGCCAAGAAAAATAT 
GCCGAGTTACTTTCAGACTGGCCCGAAGAGGATCTACGCACCTTCGTCAAACTTCTTGAAAAACTAAAT 
AAAGCCGTGGAGACACGCGTCGGAAAGCATTGGCCGCGCCCC 

>RXS00649-downstream
TGAGTGAGCCCAAGCCAGAGOCC

>RXS00650-upstream

AAGGCTAGACTAAAGTACGATTCATCTGCTCATCGATACTCTTGAAGGCGCATTTTCATTCGAAACGAA 
GTGCGCCATTGGGAAGGACCTAGTTCAAACA 

>RXS00650 

ATGATTCGCGTGCTGCTTGCTGATGACCACGAAATCGTGAGGCTCGGACTCCGAGCTGTGCTGGAAAGC 
GCCGAGGACATTGAAGTGGTGGGCGAAGTCTCCACCGCCGAAGGTGCGGTGCAGGCAGCCCAAGAAGGC 
GGAATCGACGTCATCTTGATGGACCTCCGATTCGGCCCCGGCGTCCAAGGAACCCAGGTTTCCACAGGC 
GCAGACGCCACCGCAGCCATCAAGCGAAACATCGATAACCCGCCAAAAGTCCTGGTCGTGACCAACTAC 
GACACCGACACAGACATCCTCGGCGCAATCGAAGCCGGCGCACTGGGCTACCTGCTCAAAGACGCCCCA 
CCGAGCGAACTCCTGGCAGCAGTACGATCCGCAGCAGAAGGTGACTCCACACTGTCACCCATGGTTGCG 
AACCGCCTGATGACTCGCGTGCGCACCCCCAAAACCTCACTCACCCCACGTGAACTGGAAGTTCTCAAG 
CTGGTTGCCGGTGGATCCTCCAACCGCGACATTGGCCGTATCCTCTTCCTCTCAGAAGCCACGGTGAAA 
TCCCACCTCGTGCACATCTACGACAAGCTCGGCGTGCGGTCACGTACCTCCGCTGTCGCAGCCGCACGT 
GAGCAGGGGCTGCTG 

>RXS00650-downstream
TAGCGGGGGTTGCTGCAAGGCTT 

>RXS00657-upstream

GTGCGGATCGGGTATCCGCGCTACACTTAGAGGTGTTAGAGATCATGAGTTTCCACGAACTGTAACGCA 
GGATTCACCAATCAATGAAAGGTCGACCGAG

>RXS00657 

ATGAGCACTGAAGACATTGTCGTCGTAGCAGTAGATGGCTCGGACGCCTCAAAACAAGCTGTTCGGTGG 
GCTGCAAATACCGCCAACAAACGTGGCATTCCACTTCGCTTGGCTTCCAGCTACACCATGCCTCAGTTC 
CTCTACGCAGAGGGAATGGTTCCACCACAAGAGCTTTTCGATGACCTCCAGGCCGAAGCCCTGGAAAAG 
ATTAACGAAGCCCGTGACATCGCCCATGAGGTAGCGCCAGAAATCAAGATCGGGCACACCATCGCTGAA 
GGCAGTCCCATCGACATGCTGTTGGAAATGTCTCCCGATGCCACAATGATCGTCATGGGTTCCCGCGGA 
CTCGGCGGACTCTCCGGAATGGTCATGGGCTCCGTCTCCGGTGCAGTGGTCAGCCACGCAAAGTGTCCA 
GTCGTTGTTGTCCGTGAAGACAGCGCAGTCAACGAAGACAGCAAGTACGGCCCAGTCGTCGTCGGTGTG 
GATGGCTCCGAAGTCTCCCAACAGGCAACCGAATACGCATTTGCGGAAGCTGAAGCTCGTGGCGCCGAA 
CTCGTTGCAGTTCACACCTGGATGGACATGCAGGTACAGGCATCACTTGCAGGTCTTGCAGCTGCTCAA 
CAGCAGTGGGATGAAGTGGAACGTCAGCAAACCGACATGCTGATCGAACGCCTCGCACCACTGGTGGAA 
AAGTACCCAAGTGTAACCGTCAAGAAGATCATCACCCGTGACCGCCCAGTTCGCGCACTTGCAGAAGCA 
TCTGAAAACGCGCAGCTCCTAGTCGTTGGTTCCCATGGTCGTGGCGGATTTAAGGGCATGCTCCTTGGC 
TCCACCTCCCGCGCACTGCTGCAATCCGCACCGTGCCCAATGATGGTGGTTCGCCCACCTGAGAAGATT 
AAGAAG 

>RXS00657-downstream
TAGTTTCTTTTAAGTTTCGATGC

>RXS00686-upstream

ATAGGCTTGAACAATACGTCGTTACACTGGCCGATTTGATACCTTTCAAAACTTTTACCCTTCATCGGA 
GTGCCAGGGGAACTTAGAGGAGCATTAAATA 

>RXS00686 

ATGGCGGGAGGAAATCGCGAACCTGGACGTACAGTCACCTCCAAGGTGATCGCCGTACTGGGAGCTTTT 
GAACACACCATGCGTCCACTTGGTGTCACTGAAATCGCTGAGCTGGCAGACCTCCCACCAAGTACCACC 
CACCGTCTCGTTTCTGAATTAACCGAAGGCGGACTACTCAGCAAGAAATCTGATGGGCGCTACCAATTG
GGCTTACGTATCTGGGAACTCGCCCAAAATACAGGACGGCAGTTACGCGACACTGCACGCCCGTTCATC 
CAAGAGCTCTACTCACTTACTTCCGAGACTGCGCAGCTAGTGGTCCGCGATAAAGATGAAGCACTTTTG 
ATTGACCGAGCCTACGGCACGAAGAAAATTCCACGCTCGGCTCGAGTCGGTGGTCGACTACCTCTGAAC 
TCCACTGCGGTTGGCAAGATTCTCCTTGCGTTTGATGAGCCATGGGTAAAACAGTCCTATCTCAAGCTG 
CCACTCAACGCCTCCACCCCAAAGACAATTGTGAATCCCGACGTCTTGGCTGCGCAGCTGAAACAAATT
CACTCGCAAGGCTTTGCCATCACACATGACGAGCAACGAATCGGCGGCGCATCGATCGCCGTACCGGTC 
TGGCATACAGGAAAACTGGGAGCAGCACTGGGGTTGGTGGTTCCCACCGCACAGGCTGCAAATCTTGAG 
CGCTATCTCCCGATCCTTCAGGCGACAAGTCAGAGAATTACAAAAGCAACCGCGCTCATTCCTTTGGAC 
ACACTTTTGGCTTCACACAAAAATGCAGAACGAAAAGGCGATACC 

>RXS00686-downstream
TAAACCCGCCCTCCATCTGCATA 

>RXS00719-upstream 

CAGATGATGCACACATCGTGGACACCTCTGATATGACCATGGATCAAGTACTTGATCACCTCATCCACC 
TAGTGGAAGCCTCCGCTGAAAGGAGCAACCA 

>RXS00719 

GTGACTGATAAACACACCATGCCTGGTGAAGAGGACGACACCGTATTCGTCTACCACACCCACAAAGGC 
GAAATGGACGTCGAAGGTGCGTTTGCTGACGAAGAAGAACTAGCACCACACGGCGGTTGGGCTTCCGCA 
GATTTCGACCCAGCAGAATTCGGCTACGAAGACTCTGACGATGACTTCGATGCAGAGGACTTTGACGAA 
ACAGAGTTCTCCAACCCTGATTTCGGCGAAGACTACTCTGATGAAGACTGGGAAGAAATCGAGACCGCA 
TTCGGATTCGACCCAAGCCACCTTGAAGAAGCTCTCTGCACGGTCGCTATCGTCGGACGCCCAAATGTT 
GGTAAATCAACCTTGGTGAACCGCTTTATTGGACGTCGAGAAGCAGTCGTGGAAGATTTCCCCGGCGTA 
ACCCGTGACCGCATCTCCTACATCTCTGACTGGGGTGGACACCGTTTCTGGGTTCAGGACACAGGCGGA 
TGGGATCCTAACGTCAAGGGCATCCACGCATCGATCGCACAGCAAGCAGAAGTTGCTATGAGCACTGCC 
GATGTCATCGTATTCGTCGTGGACACCAAGGTGGGCATCACCGAAACTGACTCAGTGATGGCAGCAAAA 
CTGTTGCGCTCGGAAGTGCCAGTGATCTTGGTTGCGAACAAATTCGACTCCGACAGCCAGTGGGCTGAC 
ATGGCTGAGTTCTACAGCCTCGGCCTTGGCGATCCATACCCAGTTTCAGCCCAGCATGGACGTGGTGGC 
GCTGACGTTTTGGACAAAGTCCTTGAACTCTTCCCAGAAGAGCCTCGCTCCAAGTCCATCGTGGAAGGC 
CCTCGTCGTGTCGCCCTTGTGGGTAAGCCAAACGTGGGTAAGTCTTCACTGCTCAACAAGTTTGCTGGC 
GAGACCCGCTCTGTCGTGGACAATGTTGCAGGAACCACCGTTGACCCCGTTGACTCCCTGATTCAGCTG 
GATCAAAAACTGTGGAAATTCGTGGATACTGCTGGTCTTCGCAAAAAGGTCAAGACTGCATCTGGCCAC 
GAGTACTACGCATCACTGCGTACCCACGGTGCCATCGATGCAGCTGAGCTGTGTGTTTTGCTTATCGAT 
TCCTCCGAACCCATCACCGAGCAGGATCAGCGCGTGCTCGCAATGATCACCGATGCCGGTAAGGCACTG 
GTTATTGCGTTCAACAAGTGGGATCTCATGGATGAAGATCGCCGCATCGATTTGGATCGCGAACTTGAT 
CTCCAGTTGGCACACGTGCCTTGGGCAAAGCGCATCAACATCTCCGCCAAAACCGGTCGTGCACTGCAG 
CGCCTCGAGCCAGCAATGTTGGAAGCGCTCGACAACTGGGATCGCCGTATCTCCACTGGTCAGCTGAAC 
ACCTGGCTGCGTGAAGCAATTGCTGCGAACCCACCACCAATGCGTGGCGGACGTTTGCCTCGAGTGCTG 
TTTGCCACCCAGGCATCTACTCAGCCACCAGTGATCGTACTGTTCACCACCGGCTTCCTCGAAGCAGGT 
TACCGACGATACCTGGAGCGCAAGTTCCGTGAACGTTTCGGCTTTGAAGGCACTCCAGTGCGAATCGCT 
GTGCGTGTTCGCGAGCGCCGCGGCAAGGGCGGAAACAAGCAG 

>RXS00719-downstream 
TAAAGCTTGATTTTCCCTAAAAG 

>RXS00738 

TGTCAAGAGGAGACGGATGGCTTTTTTGATTTTGGGCGCGATATGCGGCCCGGTGAGCGCCGGTCGTAT 
GGCACTTTGCTTAACGACGCCACGACGCAGGTGTCGCACATCCTCGGCAATGCCTTCACCCGATCTGGG 
CTCAACGCTGAGTACGCGAATCTTTATGGTCAGGCGTTGGTGGGCATGGTGTCGATGACGGCGCAATGG 
TGGTTGGATGAGCGCACTCCGCCGAAGGAAGAAGTTGCCGCACATATTGTTAATCTTTGTTGGAATGGT 
TTGACGGGGATGGAAGCCGATCCGAAGTTAACTCCCATCAGTTCTGCTGAGGGTGCGATTTTTGGTCAA 
GAAAAGGAGAGTGAAGCG

>RXS00738-downstream
TGACACCTATGCTCGCGGGGCTG 

>RXS00774-upstream

ATTATCGCCTTATAGTGTTGCCATGCGTACTGCATATAGACAGCAATTGGATGAATTCGCACACAATCT 
AATCATTTTGTGTGATCTAACTAAGGAGTGC 

>RXS00774 

ATGGATAAGGCGACTGATGCCCTCCTGCGCACTTCTTTGGCATCGGCAGAAAGCGCTTTAGGCAATGCA 
GAAAAGCTTGAAGAGCTTCGTACTGGATGCGAGTCTCAAGCCGTCGAACTTTTGGCGCTTGAAACTCCT 
GTAGCCCGTGATCTTCGCCAGGTTGTCTCCTCCATCTACATCGTCGAGGAAATTACCCGTATGGGTGCT 
CTGGCAATGCACGTGGCTAATTCCGTGCGCCGCCGTTACCCCGATCCGGTGATCCCGGAGGACATGCGT 
GGCTATTTCAAGGAGATGGCCCGCCTCGCAGCTGACATGACAGATCATATTCGTCAGATCCTCATTGAT 
CCTGAACCAGATCTTGCCCTAGAGATGGCTAAAAGCGATGACGCGGTGGATGATCTGCATCAGCACATC 
ATGCGTATTCTCACGCTGCGTCCTTGGCCTCACGACACCAAGAGCGCGGTTGATTTGACGCTGCTTTCC 
CGCTTCTACGAGCGTTACGCCGATCACACGGTAAACGTGGCCGCCCGTATCATTTACCTGTCCACCGGG 
CTGCACCCGGAGGAGTACATGGAAAAGCGCGAGCAACAAAGGGCCGATGCCGACATGGAGAAGCGCTGG 
GCCGAGCTGGAGCGGCAGTTCCGCACCAGCGAG 

>RXS00774-downstream
TAAAAAGCTGCTTCTCGACGCTA 

>RXS01082-upstream

GGCCTTGCTTGCGTTAGTTGCAGTGCTTCCTGAATATGCTTCTGAAACGGTTGTCGAGCACACTTATCA 
AACATCGGCGGCGAATTAAGAAGGTGAACAG 

>RXS01082 

TTGACGCAGTGGGGTAATTCGAATGTTGTGGAGGACTATCTCACAGCACTTTTCCGTGCAGAAGAATGG 
GATGAGGAACCAACAACAGGAAAACTCGCTGAAGTAATTGGAGTTACCGCATCAACGGTGTCGGCGACG 
CTCAAAAAACTCAACCCTGAGGGCTTCGTCAATTACCGTCCCTACGGGGACATCGAGCTGACGCCCGCA 
GGTCGAGACATCGCCATCAACGTGATCAGGCGGCGCCGGATCATTGAGACCTATCTGTCTGAGAAGCTT 
GGATTAGGCGCTCATGAACTACACGGCGAGGCAGATTTATTAGAGCACGCAGTGTCTCCACTGGTGTTG 
GAGAAGATGTTTCAGGCAGTGGGCTATCCAACGTTGGATCCTCACGGGGATCCCATCCCCACCGAATCT 
GGGGAGATGACCATCAATGATGGACTCATGCTTTTGGGACTAAAAGCTGGCGCATCTGCCACGGTTACA 
CGTGTTAGGGACGGAAACCCATCAGTGGTTCGGTACCTCACTGGAGTGGGAATTACCGTGGGCACAACG 
GTCACGGTCGTTGAAGCTCTTAGCGATATTGCCACACTGCGCCTGCAGATCGGGGAAATGTTTCAAGAC 
ATTCCCCTTGCAGTGGCAAACGCAGTGCGCGTATCACGT 

>RXS01082-downstream 
TAGTTCAGCGTGCCCAGCGCGCT 

>RXS01123-upstream 

AAAGAGCAGTGATTTTTCCCGATCCCCCCCTGGCCGCCTAGCGGAAATTATTGATACCGCGTGGCGGTT 
GGTGGAAACACGTGGCTGGGCGAATGTGAGC 

>RXS01123 

ATGCGAACCCTGGCCGCGGAGCTAAATATCAAGGCGCCGTCGCTGTACAAGCATGTAAAAACGCGCGAG 
GATATCGCCGCACACATCGCCACGAAGGCATTTATTCAGCTGGGGCAAAGCCTGCATGAACATTGTGAA 
AGTGTGGAGGATTTGCTTGCGGAATACCGCTCCATGGCTCGGGAAAATCCAAATATTTACCGGCTTCTC 
ACCAGTTCAGAGTTCCCCCGCGAGCTACTTCCAGAAGGCCTAGAAACTTGGGCAGGAACGCCATTCTAC 
CTGGTCACCGGCCACGATCCGATCAAGGGTCAAGCACTGTGGGCATTCGCGCACGGCATGGCCATCCTG 
GAAATCGACGCCCGATTCGCCGGCCCCAACAATGGATCCCCCGCGGATGGCGTGTGGGAGATCGGCGCG 
CGGGCATTTGACACACAAGTATTCGACCAAGGC 

>RXS01123-downstream 
TGAGCAAAAAGGCGCTAAGCTGT 

>RXS01189-upstream

AGCCGGAATGACGCTCATTGTGTCCCGCGAAGACGACCAGTTCAGCGTCCGCATGCTCATTAATGCACC 
TGCAAATACACCTGCAGAAAAGGAAGCTTAA

>RXS01189

ATGATTTCCATTTCCATCGCCGACGACGAAGCCCTGATCGCAAGCTCCCTGGCAACCTTGCTCAGCTTG 
GAACCCGATTTAGACGTCCGACCTACCGCAGGATCCGGTGAAGAACTCATTGAAACGTGGGCGGATCCA 
AGCAACCGAACCGATGTATGCGTCCTTGACCTTCAACTCGGAGGCATCGACGGCATCGACACCGCCACC 
CGGCTCATGGAAACCACCCCAGATTTGGCCGTGCTCATCGTGACCAGCCACGCCAGGCCCCGACAACTC 
AAACGCGCGCTTGCAGCAGGTGTTTTAGGATTCTTGCCCAAAACATCCACCGCAGATGAATTCGCeACC 
GCAATCCGCACCGTTCACGCTGGACGACGCTACATCGACCCCGAACTAGCCGCCATGACGATCAGCGCC 
GGTGAATCCCCATTAACCAACCGTGAAGAAGAAGTCCTCGAACTAGCAGGCCAAGGACTAAGCGCCGAA 
GAAATTGCGGTGGCAGCGCACCTCGCGCCGGGAACCACCCGCAACTATTTATCCCAAGCTATGACAAAA 
GTAGGCGCGCAGAATCGCTTTGAAGCGTTCACGCGCGCCAGGGAATTGGGCTGGTTG 

>RXS01189-downstream
TAGCTTGTGGCTTATCTCCTATT 

>RXS01242-upstream

CGCCGGCAACCAAATGAGGCTTTTGGGCGTTGGACAGTGAGACAATGGGTAAGAAATTCGGACATATTT 
AGTAAATTGGCTTTTTGCTTTAAGGAGTGAC 

>RXS01242 

ATGTACGCAGAGGAGCGCCGTCGACAGATTGCCTCATTAACGGCAGTTGAGGGACGTGTAAATGTCACA 
GAATTAGCGGGCCGATTCGATGTCACTGCAGAGACGATTCGACGAGACCTTGCGGTGCTAGACCGCGAG 
GGAATTGTTCACCGCGTTCACGGTGGCGCAGTAGCCACCCAATCTTTCCAAACCACAGAGTTGAGCTTG 
GATACTCGTTTCAGGTCTGCATCGTCAGCAAAGTACTCCATTGCCAAGGCAGCGATGCAGTTCCTGCCC 
GCTGAGCATGGCGGACTGTTCCTCGATGCGGGAACTACTGTTACTGCTTTGGCCGATCTCATTTCTGAG 
CATCCTAGCTCCAAGCAGTGGTCGATCGTGACCAACTGCCTCCCCATCGCACTTAATCTGGCCAACGCC
GGGCTTGATGATGTCCAGCTGCTTGGAGGAAGCGTTCGCGCGATCACCCAGGCTGTTGTGGGTGACACT 
GCGCTTCGTACTCTCGCGCTGATGCGTGCGGATGTAGTGTTCATCGGCACCAACGCGTTGACGTTGGAT 
CACGGATTGTCTACGGCCGATTCCCAAGAGGCTGCCATGAAATCTGCGATGATCACCAACGCCCACAAG 
GTGGTGGTGTTGTGTGACTCCACCAAGATGGGCACCGACTACCTCGTGAGCTTTGGCGCAATCAGCGAT 
ATCGATGTGGTGGTCACCGATGCGGGTGCACCAGCAAGTTTCGTTGAGCAGTTGCGAGAACGCGATGTA 
GAAGTTGTGATTGCAGAA 

>RXS01242-downstream
TGATTCTTACAGTCACTGCAAGT 

>RXS01607-upstream

GGGCTGAAGGGCTGGGCGGAACAATAATTATTGAATCTACAATCGGATCGGGAACTGGAATTTCCGCCC 
GTTTTCCCTATCCACAAAAGGACCAAGATAA

>RXS01607 

GTGATCCGTATTCTGTTGGCTGATGATCATCCCGTTGTTCGCGCAGGCCTTGCCTCCTTGCTGGTGAGT 
GAAGATGATTTTGAGATAGTGGACATGGTGGGCACCCCAGATGATGCCGTTGCGCGCGCCGCGGAAGGC 
GGGGTGGATGTGGTGTTGATGGATCTGCGTTTTGGTGATCAACCAGGCATCGAGGTCGCCGGCGGGGTA 
GAGGCAACGCGTCGCATCCGTGCGCTGGACAACCCGCCACAGGTACTGGTGGTGACCAACTACTCCACA 
GACGGCGATGTGGTGGGCGCAGTATCTGCTGGTGCCGTGGGGTATTTGCTCAAAGATAGCTCCCCAGAA 
GATCTCATTGCCGGTGTTCGCGATGCCGCGCGGGGAGAATCAGTGCTTTCAAAGCAGGTCGCCAGCAAG 
ATCATGGGGCGGATGAACAACCCCATGACTGCTCTCAGTGCCAGAGAAATTGAAGTGCTGTCCTTGGTG 
GCGCAAGGGCAAAGCAATAGAGAAATCGGCAAGAAACTTTTCCTCACTGAGGCCACGGTGAAAAGTCAC 
ATGGGGCATGTGTTCAACAAGCTGGATGTCACCTCTAGAACAGCTGCGGTAGCTGAAGCCAGACAGCGC 
GGAATTATC 

>RXS01607-downstream
TAGACGCACACGTGTTGGTAACC 

>RXS01674-upstream

CGGCGTCGATTCCAGAAGGTTTGTAGACATGCTTCAAGGTTGCGCTAATTGAAAAGAACGCGGTAGACG 
GTACTTTCATATCCACCCATATAATGTTGAT 

>RXS01674

ATGGATAATGGGTGGCCGAACCTGCAAACTCTCGCACTCTTTGTGGCGATTGTGGAAGAGGGGAGCCTC
GGTGCCGGTGCTCGAAAAGTCGGAATGGCCCAACCTAATGCCAGTCGGGCTATCGCAGAGCTTGAGGCA 
GACATGAAAGCCGAATTGTTGGTACGTCATCCTCGAGGATCACATCCAACAGCTGCTGGACTTGCGCTT 
GTTGAGCATTCGCGCGATCTGCTTCAATCTGTACAAGAATTTACTGAATGGGTGACAGAGGGACGAACT 
GAGCAGCCGCTGAAATTGCATGTTGGGGCCAGTATGACCATTGCCGAGGCTCTACTTCCAGCTTGGGTT 
GCGGACATGCGCACGCGTTTTCCTGCCTGCCGTGTCGACGTCTCTGTGATGAATTCTTCTCAAGTAATT 
GAAGCCGTCCAGAAAGGGCACTTGCAACTAGGTTTTATTGAAACACCGCATGTTCCCGTACGGCTTCAT 
GCTCGTGTGGTGCAAGAGGACAAGCTGATTGTGGTGATTTCTCCTAATCATGAGTGGGCTAATCGCACG 
GGTAGGATCAGTCTTCGGGAGTTGTCGGAAACTCCGCTGATAGTGAGGGAAGTCGGCTCAGGTACCCGA 
GAAGCATTACAAGAATTACTTGCGGATTATGACATGGCTGAGCCGATTCAAGTGTTAAACAGCAATGCT 
GCGGTACGTGTTGTTGTTGAAGCAGGGGCAGGTCCTGCAGTACTTGGTGAATTAGCCTTGCGTGATCAT 
CTTGCGCTCGGCAGGCTGTTGAGTGTGCCATTTGAAGGCAGTGGAGTTACTCGTCCTCTTACAGCTGTG 
TGGAGTGGACCCCGCAGATTGCCGATTCTAGCGGGAGAATTAGTGTCCATCGCATCGAACCACATC 

>RXS01674-downstream
TGATTTTGAGCCCTGGCTAACGG 

>RXS01872-upstream 

ATTGCTTGGCTCATGGAGTTCATCATGCGCCAACAGCAAATATTAGTAAAATGTTAGAAATAGCTGTTT 
TTGATTCACTTTGTGCATGTAGGCTGTGACC 

>RXS01872 

ATGGGCAACGACGGCGGAGACCTGCGAATCGACGACCTACGCAGCTTCATTTCAGTCGCTCAATCAGGC 
CACCTAACCGAAACTGCCCAAAGATTAGGCATCCCGCAGCCCACACTTTCCAGACGAATCAGCCGAGTG 
GAAAAACACGCAGGCACCCCACTTTTCGACCGCGCCGGCCGCAAACTCGTCCTCAACCAACGAGGCCAC 
GCCTTCCTCAACCACGCCAGCGCCATCGTCGCAGAATTCAACTCCGCCGCAACTGAAATCAAACGCCTC 
ATGGACCCAGAAAAAGGCACAATCCGACTGGACTTCATGCATTCCTTGGGCACTTGGATGGTCCCCGAA 
CTTATCCGAACATTCCGCGCCGAACACCCCAACGTAGAATTCCAACTCCACCAAGCGGCAGCAATGCTC 
CTGGTAGATCGTGTTTTGGCTGATGAAACTGACCTCGCATTAGTTGGCCCCAAACCTGCCGAGGTTGGT 
ACCTCTTTAGGGTGGGCGCCACTGCTTCGTCAACGACTTGCCCTAGCTGTTCCCGCAGATCACCGGCTT 
GCCTCCTTTTCTGGCCAAGGAGAATTGCCGTTGATTACTGCGGCGGAAGAACCTTTCGTGGCGATGCGA 
GCAGGTTTCGGCACCCGACTCCTCATGGATGCATTAGCCGAAGAAGCCGGTTTTGTTCCCAATGTGGTT 
TTCGAATCCATGGAACTCACCACCGTCGCAGGGCTTGTCAGCGCAGGTCTCGGCGTTGGTGTGGTTCCG 
ATGGATGATCCCTACCTTTCCACAGTGGGAATCGTGCAACGCCCACTTAGTCCACCCGCTTATAGGGAA 

>RXS02117-upstream 

TGAACACAAACTACCACGTTTATTGCATCATGCAACACCCTTGCTAGGATATAAAATACTCTATGAGTC 
CAGACGTTTTTAAAGGGAGCGAATTACCATA 

>RXS02117 

GTGTCTACAGATCCAGAAGAGTTCGACCAAGCTGAAACCCTCGATCAACTCGCGTATGAGATCATCCTG 
CTCACCCGGTATGGTGTCCAAAACACACCGACCAACAAGCGCGAAGCCATCATGGATCGCAGCGCCCTC 
ATCTTGCTCACCCGCCTTGACGCTCAAGGACCTATGACAGTTAATGAGCTAGCTGAAAGCTTTGGACTT 
AACGTTTCTACCGTGCACCGCCAACTCAAAGCAGCCATTGCCAATGGCTTAATTGAAGTCGTCGATGAT 
CAAGCATGCCCCGCTAAACTTCATCGTCCAACTGAGTTGGGTAAAGAAAAACTGCAGCAGGAGCTTCTT 
GCCCGCCAGCAGGATCTCACCCGCATTCTTCATGATTGGGATGAGGAAGACATTAAAACGCATGCCAAG 
CTATTGCGGAAGCACAATGAAAGCTTGGAAGAATACCTCGATATGAAGTGGCCCCGCCCC 

>RXS02117-downstream 
TAAGTGCCCATAAACGCACCTCT 

>RXS02288-upstream

AACAACAATCTAACGCCATCATGTTATAAAAAAGCAAGACCTAACATAAAAATGTTAGAAAGTGCTGGA 
TCTAACAACATTTCCGTGGTAACTTTTTCAC 

>RXS02288 

ATGTCCCAAGTGATTCCCGCCAGCTCACAAGAAAAGCGTCGTGAGCGCATCGTTTCTTATGTCACCCGT 
CATGGATTCGCTCGTGTTGAAGCATTAGCTGAGCTTTTTGAGGTCAGCGCAATGACCATTCACCGTGAT 
TTGGAGGCGCTGGCTGCAGACAATTTGGTGGAGCGCATTAGGGGTGGCGCGCGTTCGGTGTCGCCGTCG 
ATGAGTGAGTTGGCAGTGGAGCAGCGTCGGCATTTGCATCGCACTGTTAAAGAGGCGTTGTGTACTGCA 
GCAGCACGGTTGATTCCGGAGGGCGCTGTGGTGGCGATTGATGATTCCACCACGTTGGAGTCTTTGGTT 
GAGAAGTTGCCGCAGCGGTCACCATCGGCGTTGATTACGCATTCTTTGAAGACAATGGCGGATCATCGT
GTGCGCGCCGGGATGAGCGATATCCGTTTGATTGCGTGTGCGGGATTGTATTTCGCGGAGACTGATTCT 
TTCTTGGGCAAGGCAACTTCAGCGCAGTTGAATGAGCTGTCGGCGGATATTTCTTTTGTTTCTACGACT 
GCGGTGCGCGCTACGGGGGAGGTTCCGGCGCTGTTTCATCCTGATATGGAGGCTGCTGATACGAAGCGG 
GCGTTGATTGGGATTGGTAGCGTGCGTGTGTTGGTGGTGGATTCTAGTAAATTTGGTTCGGCTGGTGTG 
TTCAAGGTTGCTTCGATTGAGGAGTTTGACCACATCATCATTGATCAGCAGTGCACCCGTGAGCAGCGG 
GATCTTTTGCGTAATTCGCGCGCGCAGATCCATGTGATTGACCACAATGGTGATGAAATTTTGGATACC 
CCAACGGAAGAGGATTTT 

>RXS02288-downstream
TAAGATGGCTTTGGTTCTTGGAA 

>RXS02573-upstream

CTTCCACACGGCTCTAACTATAGGAAAAGACGGCAGCAAAGCATTAATCGTCGGTAGGCTGAAGTGAAG 
TACTTTCCGAAAGATTCACAGGGAGCATGCA 

>RXS02573 

ATGACGAACAAAACCATGCTGGTTGCTTTTGATGGCTCACCGGAATCCCGGCGCGCTTTGGAATATGCG 
GCGAAATTGTTGCAGCCGCGCACCGTGGAAATTTTAACTGCGTGGGAGCCATTGCATCGGCAAGCTGCG 
CGCTCGGTTTCGTTGATCACCTTGGGGGTGGAACCCGAAGACCCCGCCCATTCCGCTGCACTAAAAACC 
TGCCAGGAAGGCGTAGAGCTAGCCCAATCTCTAGGTCTGGAAGCGCGAGCCCACATGGTGGAATCCGCA 
ACGGCCGTGTGGAGCGCCATCGTTGATGCTGCTGACGAGCTCCGCCCCGACGTGATTGTCACCGGCACC 
CGCGGGATCTCCGGATGGAAATCCCTGTGGCAATCCTCCACCTCAGACAGCGTGCTCCACCACGCCGAC 
GTACCAGTTTTTGTCGTTCCACCCCTGGAC 

>RXS02573-downstream
TAAAACCGAGACGAGAACCAAGA 

>RXS02627 

GATGTCACTGTGGAAAGCCAACCAGAACGCGTCGTTGCCCTGGGTTGGGGAGATGCTGAGGCTGCGCTG 
GAATTCGGTGTGCAGCCTGTGGGTGCATCAGATTGGCTCGCATTCGGTGGTGAAGGCGTGGGACCGTGG 
ATTGAGGATTCTGCCTACGATGAAGCGCCAGAAATAATCGGAACCATGGAACCGGAGTATGAAAAGATT 
GCAGCGCTTGAACCGGATCTGATTTTGGACGTGCGCAGCTCTGGCGACCAGGAACGCTATGACAAGTTG 
TCTTCAATCGCACTGACCATCGGCGTTCCAGAAGGTGGCGATAGCTACCTCACCCCACGCGCTGAGCAG 
GTAACCATGATCGCCACTGCTCTGGGGCAGGCTGAACGTGGTGAAGAAGTGAACGCTGAATACGAGCAG 
CTCACTGCTGATATTCGTGCAGCTCACCCGGGCTGGCCTGAGAAGACCGCGGCTGCTGTATCTGCAACG 
GCAACCAGCTGGGGTGCATACATCAAGGGCTCCAACCGTGTAGATACTTTGCTGGACCTGGGCTTCCAG 
GAAAACCCTGAGCTGGCTAAACAGCAACCTGGCGATACGGGTTTCTCCATCAAATTCAGTGAAGAGACT 
TTCGGCGTTGTGGATTCCGACCTGGTTGTCGGCTTTGCCATCGGTATGACTCCTGAGGAAATGGCAGAG 
CAGGTTCCATGGCAGATGTTGACCGCCACTCGTGACGGCCGTTCCTTTGTGATGCCCCGTGAGATTTCC 
AATGCGTTTTCTTTGGGTTCCCCGCAGTCCACTCGGTTCGCGTTAGACGCCTTGGTGCCACTTCTGGAG 
GAGCATGCAGGGGAG 

>RXS02627-downstream
TAGTGGTCCGGTGGTGCGGGCAG 

>RXS02691-upstream

CTTAGGGCTTATCTGTTTTCCAGCCTTGCTTTTTACTAGGCGCTCCTGTCCCGCTTCAGTCACCAAAAC 
CACACCCCTGGTTATGACCAGATCGGCTAAA 

>RXS02691 

ATGAACACCATGCCTGACCAACCGCTCAACCAGGACGGATTCCCTACCGCATCCAAAGGGGTGGAACCC 
GACAACCTCCCCGACCGCGTTCTCGTGGACGGCCTTAAACCAAAGCATCAGCAGCTTCGTGAAATTTTG 
GAGGAAATCTGCACCACCCAGCTTCAGCCTGGGGACATGCTGCCTGGTGAGCGCATCCTGGAAGAAAAG 
TATGGCGTCAGCCGAATTACGGTTCGTCGGGCGATTGGTGATCTGGTCGCGTCCGGCAGGTTGAAGCGA 
GCTCGCGGCAAAGGTACCTTCGTGGCCCACTCGCCGTTGATTTCCCGCCTGCATTTGGCCTCGTTTTCC 
GCAGAGATGGCCGCCCAGAAGCTATCGGCTACCAGCAGGATTTTGAGTTCTTCCCGCGGTCCCGCCCCA 
GATGATATTGCTGATTTCTTTGGTACCGATCGCGCGGCCCAGCACATCACGTTGCGCCGCCTGCGCTTT 
GGAAATGGTCGACCCTATGCCATTGACAACGGTTGGTACAACTCCGAATTCGCACCTGACCTGCTGGAA 
AATGATGTGTACAACTCCGTGTACTCCATCCTGGACCGCGTCTATGGCGTCCCCGTCACCCAGGCCGAG 
CAAACGGTCACCGCCGTAGCAGCCGACGAAGACACCGCACGGCTTCTGGACGTCACCCCCGGCGCCCCA 
CTCCTTCGTATCCTTCGACAGTCACTTTCTGGCGATAAGCCCGTGGAATGGTGCGTTTCCTTGTACCGA
ACCGACCGATATTCTTTAAAAACATTGGTTACACGCTCCGAAGATCTC 

>RXS02691-downstream
TGACGTGAACCCATTTTGGTGGC 

>RXS02730-upstream

CCAACATCGCCTTGCACGTAATAGGTTAAAACACAAGTGAATGTAATCGTTTGCAGCAATCGATTACAT 
AAAGGTAGATAATGAGATAAAGCGAGGCGCT 

>RXS02730 

ATGGCGACGGAAAAATTCCGACCGACTCTTAAAGATGTCGCTCGTCAAGCAGGTGTCTCCATCGCCACA 
GCATCACGAGCACTAGCGGATAATCCGGCGGTTGCTGCATCGACTCGTGAAAGAATCCAACAATTAGCC
TCTGATCTGGGTTACCGGGCCAATGCTCAAGCTCGTGCGCTTCGCAGTTCTCGCAGCAACACCATTGGT 
GTGATTGTTCCCAGTTTGATTAACCATTACTTCGCCGCAATGGTTACTGAAATTCAAAGCACCGCCAGC 
AAAGCTGGACTTGCCACGATTATCACCAACAGCAATGAAGATGCGACCACTATGTCTGGGTCTTTGGAG 
TTTCTCACCTCGCATGGTGTCGATGGAATCATCTGCGTACCTAATGAGGAATGCGCGAATCAACTAGAG 
GACTTGCAGAAGCAAGGAATGCCAGTGGTGTTGGTTGACCGAGAGCTTCCAGGAGACTCCACCATCCCA 
ACGGCGACCTCTAACCCCCAACCAGGAATCGCCGCAGCAGTAGAACTCCTGGCTCACAACAACGCGTTG 
CCGATTGGTTACCTCTCAGGTCCCATGGACACCTCAACAGGTAGAGAGCGATTAGAGGATTTCAAAGCA 
GCCTGCGCCAACTCCAAAATTGGCGAACAGCTCGTTTTTCTGGGTGGGTACGAACAAAGCGTTGGATTT 
GAAGGCGCTACGAAATTGCTCGATCAAGGAGCTAAAACTCTTTTTGCCGGCGATTCTATGATGACGATC 
GGTGTCATTGAAGCCTGCCATAAGGCTGGTTTGGTTATCGGCAAGGATGTCAGCGTGATTGGTTTTGAT 
ACACATCCGCTTTTTGCCCTGCAACCTCATCCGTTGACAGTGATTGATCAAAATGTAGAACAACTAGCC 
CAACGAGCAGTGTCTATCCTCACCGAATTAATTGCAGGCACGGTACCTAGCGTGACGAAAACTACGATC 
CCCACTGCCCTTATTCATCGTGAATCAATCATCAACTCCACTTTAAGGAAGAAGGATGGACTCCCCAAT 
GAG 

>RXS02730-downstream
TAACTCAACCGGTACCGACATTG 

>RXS02818 

TCCTATTCCCGGAAGTTTTTGACCCAGGTGTGGATTCGAGACAATGTCGGCGATTATAAAGGCCTTACC 
GATACGGCGTTCCGTAAGAAGCTGCAGCGCGATCTTGCCTACCTGCGCAGAGTTGGCGTTCCGATTGAG 
CAGTTCACGGTCACCTCAGGCATAGCTGAAGGCCAGCAGGCGTACCGTCTGGCCCAGGATTCTTATAAG 
CTCCCCGAGGTCGAATTCACCCCAGATGAGGCCGCCGTGCTGGGCATGGCAGGGGAGATGGGCCATAAT 
CAGGAACTCGGCGCCTTCGCGCGTTCGGGGTGGACCAAATTGGCGGCCGGCGGCGCGCAGCGTGATCTG 
TCCACGTCCACAGCCTTGACCAATGCGGGCGATTTAGGTTCCTTGTCTGCAAAAACCCTCGATGCGATC 
ATCAAAGCCCGCCAATTGGGCAAGCAAATCAGCTTCGAATACCGGCGCGCCCCCAAAGACGCCCCCTCG 
CTTCGACACATGGATCCTTGGGGTCTGGTCCCTGAGCGCGACCGCATCTACCTGGTCGGATTCGACCTC 
GACCGCCAAGAAGCACGCACCTTCCGCATCACCCGCGTCCGCAACATCAAACTC 

>RXS02911-upstream

ACCGCATACATTAACGTGTTGATCATTGCCCTAGTATGCGCAGTAGCGGCTGCTCTGATCAGCAGTTAC 
CTTTTCCGCGGAAATCCGAAGGGAGCCAATA 

>RXS02911 

ATGCGCACTAGTAAAAAAGAGATGATTCTGCGCACGGCCATCGATTATATCGGCGAGTACAGCCTCGAG 
ACGCTGAGTTACGATTCGCTCGCCGAGGCGACCGGTCTGTCCAAGTCGGGCTTGATTTATCATTTCCCC 
AGCCGCCATGCGCTGCTTTTAGGCATGCACGAGTTGCTTGCCGACGACTGGGACAAGGAATTGCGCGAC 
ATAACCCGCGACCCAGAGGATCCACTTGAGCGATTGCGCGCCGTCGTGGTTACGCTTGCTGAAAACGTT 
TCGCGCCCCGAGCTGGTTTTGCTTATGGACGCCCCCTCCCACCCGGGATTTCTTAACGCCTGGCGCACT 
GTAAATCATCAATGGATCCCCGACACCGATGATCTGGAAAACGATGCCCACAAACGCGCCGTCTACTCT 
GGTGCAGCTCGCAGCCGATGGCCTCTTCGTGCACGATTACATTCA 

>RXS02911-downstream
TGATGATGTCCTCAGCAAGTCCA 

>RXS03066-upstream

AACTGTTGGCATGGCGAAGTACAATGTTCGTGCAACTGGTCACGTGGAGCGCATCGTCCGCGAAATCAC 
CGCGGCGTAATAGCACCAGCTTAAAAACCTT 

>RXS03066

ATGACATCAGACAAAGACACTGAACAATTGGAAGCGGCAGGCACTGAAATTTTAATGCCTCGCCGCCGT 
CCGGCACAGCAGCGCAGTCGTGAACGATTCAATCGAATCCTCACCGCTGCGCGTTCAGTGCTTGTCGAT 
CTAGGTTTTGAATCGTTCACGTTTGATGAAGTCGCTAAGCGTGCAGAGGTACCGATCGGCACGCTGTAC 
CAATTCTTTGCCAATAAGTATGTATTGATCTGCGAATTGGATCGTGTGGATACCGCAGAAGCTGTCGCG 
GAGTTGAAGAAATTCTCCGATCAGGTTCCTGCGTTGCAGTGGCCGGATATCCTTGATGAATTCATTGAG 
CACTTGGCTAGGCTCTGGCGCGATGATCCGTCTCGGCGGGCCGTGTGGCATGCCATCCAGTCCACGCCG 
GCAACTCGTGCGACAGCTGCGGCGACGGAAAAAGAGATGCTGGAAATCATCGCGGAAGTTATGCGCCCG 
CTTGCCCGCGGTGCCGGCTACGAGGAGCGCATGTCACTGGCGGGATTGCTGGTGCACACGGTAAGTTCC 
CTGCTTAACTATGCCGTGCGTGATGTCAATAGTTCCGAAGAGGATTTCGACAGCATCGTGGT^GAAATA 
AAACGAATGCTGATTTCTTACCTCTTCTCCGTGGCTACTGGA 

>RXS03066-downstream
TAGTCAACACGCACGTTCCACCG 

>RXS03200 

GAGAAGTTGCTGCCATTCGCCAAATCCACCCTTGACGCGGCGGAGTCTTTCCTCTCCCACGCCAAGGGC 
GCCAACGGTTCGCTCACTGGACCGTTGACCGTAGGCATCATCCCCACGGCGGCTCCTTACATTTTGCCG 
TCAATGCTGTCCATCGTGGATGAAGAATATCCAGATCTGGAACCTCACATCGTCGAGGACCAAACCAAG 
CATCTTCTCGCGTTGCTGCGCGACGGCGCCATCGACGTCGCCATGATGGCCCTGCCTTCTGAGGCACCA 
GGCATGAAGGAAATCCCCCTCTACGACGAAGACTTTATCGTCGTTACAGCTAGCGATCACCCCTTCGCC 
GGCCGCCAAGACTTAGAACTATCCGCCTTAGAAGACCTCGATCTGCTGCTTCTCGACGACGGACACTGC 
CTCCACGACCAAATTGTGGACCTGTGCCGCCGCGGAGACATCAACCCCATTAGCTCCACTACTGCTGTC 
ACCCGCGCATCCAGCCTTACCACCGTCATGCAGCTCGTCGTCGCCGGCCTTGGATCCACCTTGGTCCCA 
ATCAGCGCAATCCCATGGGAATGCACCCGACCAGGACTGGCAACAGCCAACTTCAACTCTGATGTCACC 
GCAAACCGCCGCATTGGATTGGTGTACCGTTCCTCTTCTTCTCGCGCCGAAGAGTTCGAACAGTTTGCA 
CTCATTTTGCAGCGCGCTTTCCAAGAAGCCGTCGCGCTTGCTGCCTCAACTGGCATCACCTTGAAG

>RXS03208-upstream

GTGTTTTAAGAAGTGTTTTTAAGAGAATACGCATTGAAGTAGTTTTCCCTGCTGGCAGCGGCATAAATT 
GAGTTTGGAAAAACAAGGAAGGCAGCCTCCT 

>RXS03208 

GTGAAGGATCTGGTCGATACCACCGAAATGTATCTGCGCACT 

ATTTACGAGCTGGAAGAAGAGGGCATTGTTCCTCTGCGTGCTCGTATCGCAGAACGCCTTGAGCAGTCC 
GGCCCAACTGTCAGCCAGACTGTCGCCCGTATGGAACGCGACGGTCTTGTGCACGTCAGCCCCGACCGC 
AGCCTCGAAATGACTCCAGAGGGACGTTCCCTCGCCATCGCCGTGATGCGTAAGCACCGCCTAGCAGAA 
CGCCTCCTTACCGACATCATCGGCTTGGACATCCACAAAGTCCACGACGAAGCATGCCGCTGGGAGCAC 
GTGATGAGTGATGAGGTTGAACGTCGCCTCGTTGAAGTTCTTGACGATGTGCATCGCTCCCCTTTCGGT 
AACCCAATTCCTGGCCTCGGCGAAATCGGTTTGGATCAAGCAGATGAGCCTGATTCCGGCGTTCGTGCC 
ATCGATCTG 

>RXS03219-upstream 

CGTGTTTAAATCTAGAAGTTTAAAGGGTGAAAACAGTCCATTACTTAAGCACCAATCTGCCATAATTTT 
TACCCCAACGCATAGGCTTAACGGTGTGAAT 

>RXS03219 

GTGAAGTTAACTGACGCCGCCCGTGAAGCTGGAGTAGGTTACGGTACTGCTTCTCGCGCAATTTCTGGA 
CGAGGTTCCGTTGATGCAGCAACCCGTGACAT^GTACTCGCCGCCGCCGAGAAACTTGGGTACCGAACC 
AACGCCATGGCTCGTGCACTTAGGGAAAACAAGACCCGCACCGTTGGCCTGATCGTTCCCGGCATTATC
AATAAG 

TTCTACACCGAATCCGCCACTGTCCTCCAAGATGAATTAGACAAATCCGGATACCAACTAGTTGTTTCC 
ACAACTGGAAACGACGCAGAAAAGGAACGTCGAGCTATCGAATCCATGCTCAACCGCCAGGTAGATGCA 
GTGGTGCACGCTCCAGTTAATCCCCAAGCGAAGTTTCCAAAGGGCTTCAAAGTGGTCGAGCTTAATCGT 
CGTAGC 

GATCTCAACCGACCTACTGTGACCAGCGATGATGCCACTGGTTTGAAGGAACTTGCTCTTCATATTTTG
GATCAGGGATACCGAGATATAGGTATCATTGTCGGTCCTGCTGAGCTCAGCACCGCCCGAGACCGCAAA 
GCCGGATTCATCAACGCCCTCGAAACCGAAGCCACACAACGCGGAATCCGCGAAGAACTACGATTCCGG 
GTAGTT 

CACTCCCGCTACTCCCCCACCGGCGGTTATGAAGCATTCGCAGAATTCCGCAATGATCTCCCTCAAATC
GTGGTGCCCCTGAGCACGCAATTAACTCTAGGAGTTCTCAAAGCAACCCAAGAAAACGGCATAAAAATA 
TCGGATGACCTGTCACTTGCTTGTTACGGCGTCGCCGAATGGCTCGCAGTGTGGGGCCCTGGCATCACC 
GTTTTC 

GCACCAGACCTCCCAGCCATGGGCGCCGCAGCTGCCACGCAGGTTTTAACGCTTCTCGACGCCGCCCCA 
CTCCCCGAAAACCACTTAAGCATTCCGGGGCAGCTCATTGTCCGTGGGACAACTCCAAAGGTT 

>RXS03219-downstream 
TAAAGGTAGAGGCGCACAATAATGAAAATT 

APPENDIX B: AMINO ACID SEQUENCES



> RXA00004 (1-471, translated) 157 residues 

VLTQLIESSI FDNVASRESS EFLGHAAIDL LAGLVYEKAT PYAPDEALRV AVYGYIRENL 
GSSQLTVAAV AGAHRIAVRT LHRLFEGEAY GVAELIRHLR LEAVYEDLRD PRLQNLTILA 
IGMRHGISSQ AHLTRLFRAK YGVPPAEFRR GYINSAA 

> RXA00006 (1-435, translated) 145 residues 

MVDFDTIAAR LVTETEEAII YATRDGIIRL WNGGSEKLFG YTAGEALGKS LDIIIPEKHR 
KAHWDGWDRV MESGETRYGS EPLNVPGIRA DGSKMSLEFS ITILKDDSGK IEGVAAFLRD
VTANWDEKKA LRIRIKELER QIEGH 

> RXA00029 (1-537, translated) 179 residues 

LAQATAQLIA DDEAVIFDNG TTCQAVAQEL AGRPITALCL SLHSAVALGS RAGTNVFIPG 
GPVENDSLAL SGPAVITALR DFSADVVILG SCSTSLEHGL ATTTYDDAEN KRAAIHAATR 
RILVVSARKL NHVSTFRFAD VADLHQLVTT SDAPREILAE IRDLGVQVIT VPAPDEQRS 



> RXA00126 (1-663, translated) 221 residues 

VTTPAENNTL SPETKVSITG RNVEVPDHFA ERVNTKLAKI ERLDPTLTFF HVELQHEPNP 

RRADESDRIQ ITATGKGHIA RAEAKEDSFY AALETALAKM ERSLRKVKAR RSISRSGHRA 

PLGTGEVGAQ LVAESQEARG ADELGKYDVD PYADKVDDVM PGQVVRTKEH PATPMSVDDA 

LSEMELVGHD FYLFVNEETN QPSVVYRRHA FDYGLISLSD A 



> RXA00129 (1-1497, translated) 499 residues 

VLGSIFTASA VVMILLGLGM LTVFTQRLVD QKIDIASSEI DRARVIVEEQ ITASGASTSV 

QARVNSARAA LSSLGTSGGT ETNAAYDPVV LVNNDDLVVS PEGYQIPERL RYFVSENQVS 

YQFSSIDQGD GSSYQALIIG TPTESDIPNL QVYLVFSMES DESSLALMRG LLSAALLIVV 

VLLVGIAWLA TQQVTAPVRS ASRIAERFAQ GKLRERMVVE GEDEMARLAV SFNAMAESLS 

AQITKLEEYG NLQRQFTSDV SHELRTPLTT VRMAADLIAD SEDELSPGAR RASQLMNREL 

DRFESLLSDL LEISRHDAGV AELSTALHDV RIPVRSALEQ VQHLATELDV ELLVNLPEEA 

INIQGDSRRI ERIIRNLLAN AIDHSKGLPV ELKVADNVDA VAIVVIDHGV GLKPGQDELV 

FNRFWRADPS RVRHSGGTGL GLAISREDAM LHGGNLDAAG TIGVGSIFRL VLPKEPHGNY 
REAPIPLIAP ETPWEGEQQ 

> RXA00130 (1-678, translated) 226 residues 

MSQKILVVDD DPAISEMLTI VLSAEGFDTV AVTDGALAVE TASREQPDLI LLDLMLPGMN 

GIDICRLIRQ ESSVPIIMLT AKTDTVDVVL GLESGADDYV NKPFKAKELV ARIRARLRAT 

VDEPSEIIEV GDLSIDVPAH TVKRNGAEIS LTPLEFDLLL ELARKPQQVF TREELLGKVW

GYRHASDTRL VNVHVQRLRA KIEKDPENPQ IVLTVRGVGY KTGHND 



> RXA00182 (1-3102, translated) 1034 residues 

MTSHLLHGLW IKDRGLQLWI EQVEGHRIVL PEAVEKGTFP PVVEQILDGK TFRARMNVHL 

RTPKGRHVEL PTPTAAFTPE EAVTVFSQLS FLKAETPAAT RAQRDSIAPD LWWLIVMYQG 

LARFVQAGRV TLRTVMMDNA WWPQWQLSAS LSERGWLAEM NHAAPGILRI NGGRDLAGSM 

SNELPHWIAN AILRDYRDET MPYARHEFVE ALLFNHSLRK GSTMLTHALN QWKNTITSAS 

LQLVILVEEP PAESDYEDPM DSVWPVRLMV RTGVDAPQAI QKGSIDSGGM EQLRSQYETA 

KTTSMLLDPA REDAMLGHMV DIAQNGDWDI FLTTEEIVNF ISHDVAKLRK AGIPVMLPKA 

WSTYETRAQV EARTPNDAAD SSTKAIIGLD QLVEYNWRIS VGDIQLSDEE MRELIDSKTG 

LIRLRGDWVM ADQDALRRIT SYMEELSKSS EKRARTEMEK VAMQAKLAEA NGEEGWQLLA 

AKAETLRKEF NEKFSGDGQG EVTLAELREI ALKAAENEPV EFTGSQWFNS LLGGTETPAP 

VRVDIPDTVL ADLREYQRRG VDWLYWMSAN NLGAVLADDM GLGKTLQLLS LLAVERAENP 

ELERGPTLVV CPTSVVGNWA AEAAKFVPSL KVLMHHGPQR LNDADFLSQS KGMDLIITSY 

GVITRDFKLM GQVGFERVVL DEAQAIKNSS TRVSKAVRSL PSRHRVALTG TPVENRLSEM 

RSILDFCNPG VLGSASFFRN HFAKAIEREQ DDTMTERLRQ LTAPFILRRL KTDPNIIDDL 

PEKTEQIIRV DMTTEQASLY KALVEDVQKQ LDERQGMSRK GLVLATITRI KQICNHPAHF 

LGDGSEVTLK GKHRSGKVEA LMELIDTAVK EERRMLIFTQ YAAFGRILAP YLSDRLGTNI 

PFLHGGVTKP GRDRMVAEFQ SEDGPPAMIL SLKAGGTGLN LTAASIVVHM DRWWNPAVEN 

QATDRAFRIG QRKNVDVYKM ITVGTMEESI QDILDGKTHL ASAIVGEGEG WITELNPEEL 
AMLMSYREKE GADD 

> RXA00221 (1-219, translated) 73 residues

MNAEEIGMAL LNGRKELGLR QGELADLAGV SERFIRDVEK GKTTVRLDKV IDVLRVLGLE 
LSVGIHDPLK VNQ 



> RXA00253 (1-738, translated) 246 residues

MPSETMKPAV ASTLAATSTG RRPGRPTQRI LSVESIVERT LNIAGREGFA AVTMNRLARD
MGVTPRALYN HVLNRQEIID RVWVRIIDDI KVPDLDPDNW RQSIHTLWSS LRDQFRETPR
VLLVALDEQI STQGTSPLRI AGAEESLKFL TDIGLSLKEA TIIREMMMAD VFSFTLTSDY
TFDNRPEGEK PDVFAPVPKP WLDENPDVEA PLTRPCAVEES VSTSDELFGY MVEARIAYIE
KLLAAK



> RXA00284 (1-1065, translated) 355 residues 

MARKLKDKLP RSFDKIVESG DFDAFKEVFT ERALDAKNRH GNTALHMRGV PEEFKIWMLD 

QGLDVDIRNE DGDTPLHVHS HDWNLSPDFL LKRGADVCAV NNEGESVAYS AAFFPENLKK 

LIDAGADPYS RANDGTTPLM RVIRSADTGQ IIELAEITKL LSGTEFTDAE FRETQERIIA 

MGERFEDVRE VYNEESVDQA SADMIWLYDR FDIPEELRAN TPILHDGVSP IELPGDTWQE

QFIEGYDLLV PAMGKAKSLQ GEAIRIAGRV SNEFHGNGGV NWDKDFKRMA KSLNHICEQG 

VPLGEPELEE LAAAVKSVRK GEPTEEEIDT LPRLATKWVA QNPQPLPLGE VDYKR 



> RXA00287 (1-474, translated) 158 residues 

MHHLRYESPI GELLLVASDQ GLTYVAFSDE NYAACTVGST PGTNAVLEQA VAELEEYFAG
KRKEFSTPLD WPSQNLLSFR GKVQEFLLSI PYGESKTYKQ IAAELNNVGA VRAVGSACAT
NPLPIFAPCH RVLRTDGALG GYRGGLEAKQ WLLELERP 



> RXA00291 (1-1074, translated) 358 residues

AALALISVLG ILIGVGVAMG MRRRWERVTL GLQPEELVTL VQNQTAVIDG IDEGVLALSP
NGTIGVHNEQ AQSMIGAGPM SGRTLKELGL DLGLDGVVLH GQHPETVAHN GRILYLDFHP
VRRGDQDLGY VVTIRDRTDI IELSERLDSV RTMTHALRAQ RHEFANRIHT ATGLIDAGRV
HDAAEFLGDI SRNGGQSHPL IGSAHLNEAF LSSFLSTASI SASEKGVSLR INSDTLILGT
VKDPEDVATI LGNLINNAID AAVAGEAPRW IELTLMDDAD TLVISVADSG PGIPEGVDVF
ATATQIGDSE DNERTHGHGI GLKLCRALAR SHGGDVWVID RGTEDGAVFG VKLPGVME



> RXA00292 (1-654, translated) 218 residues 

MDQTLKVLVI DDDFRVAGIH ASIVDASPGF SVVGTARTLA EAKTLIATFS PDLLLVDVYL 

PDGDGIDLVG TSNIDAFVLS AADDIKTVRR AMRAGALGYL LKPFPQKRLV ERLDRYVRYR 

HVLSGTQGLS QDKIDQATAI LNGTQAPVTV SRSATEQLLL DALEGQELSA TEASEAAGVS 

RATAQRRLAA MASQGVIQVR LRYGQSGRPE HLYSKPLL 

> RXA00307 (1-462, translated) 154 residues

VKDLVDTTEM YLRTIYELEE EGIVPLRARI AERLEQSGPT VSQTVARMER DGLVHVSPDR
SLEMTPEGRS LAIAVMRKHR LAERLLTDII GLDIHKVHDE ACRWEHVMSD EVERRLVEVL
DDVHRSPFGN PIPGLGEIGL DQADEPDSGV RAID

> RXA00319 (1-426, translated) 142 residues

MSEVIAKAKA EEAGLEDNVI FSSCGMGNWH VGQPADKRAL AELKSAGYNG DTHRAAQLGP
EHMRADLFVA LDSGHAGELA ATGVPNDKIR LMRSFDPESN PTDDVADPYY GTSQDFVLTR
ENIEDAMPGL LEWVRDHIRT DS



> RXA00348 (1-456, translated) 152 residues

TRERLENAQY QVQRDRVRGA MEVFIEAGID PGTVPIMECW INNRQHNFEV AKELLETHPD 
LTAVLCTVDA LAFGVLEYLK SVGKSAPADL SLTGFDGTHM ALARDLTTVI QPNKLKGFKA 
GETLLKMIDK EYVEPEVELE TSFHPGSTVA PI 



> RXA00350 (1-327, translated) 109 residues 

LPAKITDTRP TPESLHAVEE ETAAGARRIV ATYSKDFFDG VTLMCMLGVE PQGLRYTKVA 

SEHEEAQPKK ATKRTRKAPA KKAAAKKTTK KTTKKTTKKT TAKKTTKKS 

> RXA00363 (1-684, translated) 228 residues 

RSLTDQVMDF VRESTLDKTM VTGEWYSVYQ VSDQLGISRS PVRDALLRLE EAGLIRFTRN 

RGFQIVETKP SDVAEIFALR LGIEPAAAYR AAQLRTEEQL HEADDIIALM AQAEADNDEE 

AFFTHDRQFH RQIMTMGHSQ RGADLVEKLR AHTRILGAST AGNKRTLGDI LEEHEPILDA 
IKRQSAEMAR ATMREHIQVT GKLLLEQAVE KSGEGAAQKI WDQYTAGV



> RXA00400 (1-879, translated) 293 residues 

LFTLEQLRCF VAVANHLHFG KAAAELSMTQ PPLSRQIQKL EKIVGATLLD RDNRKVELTT 
AGFAFLKDAR LILNSTEKAA ERARLASSGM WGQLNIGYTA AAGFSILGPT LNQLHEKMPG 
VSVDLFEMVS TEQIAALESG LLDLGIGRLS SPVEGLQTRR LQADSLVLAA PKGHPLLDQN 
RPLLRKHLTG VPFLQHSPTK AKYLYDIVVR NFTINDAQVQ HTLSQITTMV SLVASGLGVA 
LVPESAKKLN YSGVEYRHFY DLPVGLAELQ AIYSTSNDNP AVRKFIKNID DTF 

> RXA00464 (1-258, translated) 86 residues 

VWFGENLPVE EWDIAEQRIA EADLMIIVGT SGIVHPAAAL PQLAQQRGVP IVEISPTRTE 
LSRIADFTWM STAAQALPAL MRGLSA 

> RXA00494 (1-297, translated) 99 residues 

MTLPHQLPGP NADFWDWQLH GTCRGETSDV FYHPDGERGR ARQRRELRAK AICAACPVLE 
SCRKHALAVA EPYGVWGGLS ESERLVILRN NERKQPVAV 



> RXA00516 (1-720, translated) 240 residues 

MVQKDAQASP ATRKADQVYT QIRREIEDGT LNPGQRMSEV WLVEHTGASR TPVRDALRRL 
AADELIILEP RQAPMVSPLS LRHIKDLFEF RRIVEVAALE EISVGASKSP RIFGEFSTLA 
ADFRELENSA DDADFTADFR RLTSKFDDLV AANTHNQFLG RSILSLKPHT TRLRIIAHSD 
HARLRQSVQE HIEMCEAVAS GDLRSAGAAC RQHLIHVEKS ILTALINADS TGSQGIDIRS 



> RXA00551 (1-348, translated) 116 residues 

MLAGMPNLNA EELAVRVRPA LTKLYVLYFR RSVNSDLSGP QLTILSRLEE NGPSRISRIA
ELEDIRMPTA SNALHQLEQL NLVERIRDTK DRRGVQVQLT DHGREELERV NNERNA 



> RXA00583 (1-738, translated) 246 residues

VYERRLLREL DGAKQPGHVA IMCDGNRRWA REAGFTDVSH GHRVGAKKIG EMVRWCDDVD
VNLVTVYLLS MENLGRSSEE LQLLFDIIAD VADELARPET NCRVRLVGHL DLLPDPVACR
LRKAEEATVN NTGIAVNMAV GYGGRQEIVD AVQKLLTIGK DEGLSVDELI ESVKVDAIST
HLYTSGQPDP DLVIRTSGEQ RLSGFMLWQS AYSEIWFTDT YWPAFRRIDF LRAIRDYSQR
SRRFGK



> RXA00592 (1-459, translated) 153 residues 

MASNSERLAE LGISLPSVAA PVAAYVPAIQ TGNQVWTSGQ LPFVDGQLPA TGKVGAEVSA 
EDAEKLARAA ALNALAAIDA LVGIDKVTRV LKIVGFVASA DDFSGQPAVV NGASNLMGEV 
FGEAGAHARS AVGVAELPLN SPVEVEVIVE IAQ

> RXA00593 (1-348, translated) 116 residues 

MTSVIPEQRN NPFYRDSATI ASSDHTERGE WVTQAKCRNG DPDALFVRGA AQRRAAAICR 
HCPVAMQCCA DALDNKVEFG VWGGLTERQR RALLRKKPHI TNWAEYLAQG GEIAGV 

> RXA00603 (1-453, translated) 151 residues 

MKLDSIDRAI IAELSANARI SNLALADKVH LTPGPCLRRV QRLEAEGIIL GYSADIHPAV

MNRGFEVTVD VTLSNFDRST VDNFESSVAQ HDEVLELHRL FGSPDYFVRI GVADLEAYEQ 

FLSSHIQTVP GIAKISSRFA MKVVKPARPQ V 



> RXA00609 (1-666, translated) 222 residues

MSKILLAEDD AGIADFIVRG LIREGFECEV TESGAEAFAR AHSGDFDLMV LDLGLPHMDG
TDVLEQLRNL QVTLPIIVLT ARTNIEDRLR TLEGGADDYM PKPFQFAELL ARIKLRLAKH
TPQETPTDAR VLRNGDLELD LRTQRVLIDG SWHDLSRREV DLLETLMRHP GQILSRVQLL
RLVWDMDWDP GSNVVDVYIR ALRKKIGAHR VETIRGSGYR LR



> RXA00638 (1-384, translated) 128 residues 

MEEIKMDNQS DGQIRVLVVD DEPNIVELLT VSLKFQGFAV MTANDGNEAL KIAREFRPDA 

YILDVMMPGM DGFELLTKLR GEGLDSPVLY LTAKDAVEHR IHGLTIGADD YVTKPFSLEE 
VITRLRVI 



> RXA00645 (1-2331, translated) 777 residues 

LGAHSANSIR GVIDRLDAST VVIVADVHWA DVESMQKLIE YSMRMVSGRF ALIMIGLDEE
NLVFHDEVVS LPSIADSTYV LPPMSIEEIR QLALTDVRGR ISTTTATDIQ RITGGIYGRV
KEVLHSESPD HWRMPNPNIP IPQSWHANLL RRITNEEVWH VLLAVAVLPS GGPIDLVKLI
GNDPTGMLCD DAVRSGLLRV LPSDGQPQVD LVLPIDRAVL QSRTPLNILA QLHHKAAEYY
GKWNQKDAQL EHEAFAAIDP NDPAVRALAQ RGYALGRTGH WMESAHALSL AANRTAHQEE
SNKYLLESID SLIAAADLPQ ARSRASTLDL GETGIQQDSM LGYLAIHEGR RLEARNLLHR
ASEELLAQHP IDPIHGPRMA QRKVLLNLVD WNPEELLVWA DRAVAWTEED AGEKVEAQAI
SLIGQSILDG CLPEDKPIPG ETTLHAQRRH MAMGWLSMVH DDPVTARQKL ERRTSINGSE
RISLWQDGWL ARSLLLLGEW ESAARTVEIG LARAEQFGIR FLEPLLLWSG ATIATARGNS
DLARNYMSRL STDQDSFIVQ SMPSAMCRMW VHRHRNEIPG AIVAGEQLEK IAAHKHVNAP
GFWPWQDVHA THLIRIGETE RAQELVNSTL EELRGSDIMS AHAKIAVPDA MLMIHHGDVK
KGFKRFDDAL DMIDPLTLPY YRARICFEYG QALRRQGQRR RADEQFARAA SLFQDMGADA
MVTLANRERR VGGLGQRSEQ AGGLTPQEYE IARLVSSGHA NREVAQELFL SPKTVEY



> RXA00651 (1-1332, translated) 444 residues

MQSSLDRVSE TGRNELDVET LVKKGNQPGA MSYRNSIHIL TASLLVVGLG ASARLTLPMF
ALSCVLLFVW GFLYFYGSTK RVDLSHGMQL GWLFVLTLVW IFMVPIVPVS IYLLFPLFFL
YLQVMPDVRG IIAILGATAI AIASQYSVGL TFGGVMGPVV SAIVTVAIDY AFRTLWRVNN
EKQELIDQLI ETRSQLAVTE RNAGIAAERQ RIAHEIHDTV AQGLSSIQML LHVSEQEILV
AEMEEKPKEA IVKKMRLARQ TASDNLSEAR AMIAALQPAA LSKTSLEAAL HRVTEPLLGI
NFVISVDGDV RQLPMKTEAT LLRIAQGAIG NVAKHSEAKN CHVTLTYEDT EVRLDVVDDG
VGFEPSEVSS TPAGLGHIGL TALQQRAMEL HGEVIVESAY GQGTAVSAAL PVEPPEGFVG
APVLADSDSS ATGEVELSSP TDDE



> RXA00655 (1-639, translated) 213 residues 

VAASASGKSK TSAGANRRRN RPSPRQRLLD SATNLFTTEG IRVIGIDRIL READVAKASL 
YSLFGSKDAL VIAYLENLDQ LWREAWRERT VGMKDPEDKI IAFFDQCIEE EPEKDFRGSH
FQNAASEYPR PETDSEKGIV AAVLEHREWC HKTLTDLLTE KNGYPGTTQA NQLLVFLDGG 
LAGSRLVHNI SPLETARDLA RQLLSAPPAD YSI 

> RXA00813 (1-1131, translated) 377 residues 

MTDIDLVVEN VQRIIATKET PPTSAEIASL IREQAGVISN EDIVMVLRRL RSDSVGVGPL 

ESLLALPGVT DVLVNAHDSV WIDRGQGVEK VDMDLGSEEA VRRLATRLAL TCGRRLDDAQ 

PFADGRITRD DGSVLRIHAV LAPLAESGTC ISVRVLRQAR LSLDDLIQSG TVPEDIAPAL 

RNIINQRRSF LVVGGTGTGK TTLLSAMLTE VPADQRIICI EDTAELHPGH PSTINLVSRQ 

ANVEGAGAVS MADLLKQSLR MRPDRIVVGE IRGAEVVDLL AAMNTGHDGG AGTIHANSIS 

EVPARMEALA ATGGLDRMAL HSQLAAAVDI VLVMKHTPFG RRLAQLGVLR GNPVTTQVVW 
DLDHGMHEGS EEAWFMP 

> RXA00822 (1-681, translated) 227 residues 

VEGVQEILSR AGIFQGVDPT AVNNLIQDME TVRFPRGATI FDEGEPGDRL YIITSGKVKL 

ARHAPDGREN LLTIMGPSDM FGELSIFDPG PRTSSAVCVT EVHAATMNSD MLRNWVADHP 

AIAEQLLRVL ARRLRRTNAS LADLIFTDVP GRVAKTLLQL ANRFGTQEAG ALRVNHDLTQ 

EEIAQLVGAS RETVNKALAT FAHRGWIRLE GKSVLIVDTE HLARRAR 

> RXA00839 (1-369, translated) 123 residues 

MPPVTHPEFR NVAIVAHVDH GKTTLVNAML EQSGVFSDHG EVADRVMDSG DLEKEKGITI 

LAKNTAIRRK GAGKDGNDLI INVIDTPGHA DFGGEVERAL SMVDGVVLLV DASEGPLPQT 
RFV 



> RXA00845 (1-903, translated) 301 residues

SSFLGRIGLV RVHAGTLRKG QQVAWIHYDE EGNQHTKTAK IAELLATVGV ARVPATEVVA
GDIAAISGIE DIMIGDTLAD PENPVALPRI TVDEPALSMT IGVNTSPMAG RGGGDKLTAR
VVKARLENEL IGNVSLKVNP TERPDTWEVQ GRGEMALSIL VETMRREGFE LTVGKPQVVT
QTIDGKLHEP YEIIVIDVPS EYQGNVTQLL ATRKGLMQSM STTPGSDWIR MEFRIPARGL
IGFRTQFMTE TRGTGIANSY SDGMDVWAGE IKGRAHGSLV ADRSGQITAY ALTQLADRGS
F



> RXA00849 (1-321, translated) 107 residues 

MVTYTTLLDK PISESAPRKA PEPLLREALG AALRSFRADK GVTLRELAEA SRVSPGYLSE 
LERGRKEVSS ELLASVCHAL GASVADVLIE AAGSMALQAA QEDLARV

> RXA00885 (1-1026, translated) 342 residues

MVSATEKRRY EVLRAIVADY IASQEPVGSK SLLERHKLNV SSATIRNDMS VLESDGFIVQ

EHASSGRVPT ERGYRLFVDS IHDIKPLSLA ERRAILGFLE GGVDLEDVLR RSVQLLSQLT 

HQAAVVQLPT LKTARVKHCE VVPLSPMRLL LVLITDTGRV DQRNVELEEP LAAEEVNVLR 

DLLNGALGEK TLTAASDALE ELAQQAPTDI RDAMRRCCDV LVNTLVDQPS DRLILAGTSN 

LTRLSRETSA SLPMVLEALE EQVVMLKLLS NVTDLDQVRV HIGGENEDIE LRSATVITTG 
YGSQGSALGG LGVVGPTYMD YSGTISKVSA VAKYVGRVLA GE 



> RXA00894 (1-1128, translated) 376 residues

MGIEFKRSPR PTLGVEWEIA LVDPETRDLA PRAAEILEIV AKNHPEVHLE REFLQNTVEL
VTGVCDTVPE AVAELSHDLD ALKEAADSLG LRLWTSGSHP FSDFRENPVS EKGSYDEIIA
RTQYWGNQML IWGIHVHVGI SHEDRVWPII NALLTNYPHL LALSASSPAW DGLDTGYASN
RTMLYQQLPT AGLPYQFQSW DEWCSYMADQ DKSGVINHTG SMHFDIRPAS KWGTIEVRVA
DSTSNLRELS AIVALTHCLV VHYDRMIDAG EELPSLQQWH VSENKWRAAR YGLDAEIIIS
RDTDEAMVQD ELRRLVAQLM PLANELGCAR ELELVLEILE RGGGYERQRR VFKETGSWKA
AVDLACDELN DLKALD



> RXA00947 (1-336, translated) 112 residues 

MARKLEHPSL AEMNLNAIMF ALSDPIRRQI LSQLSCGHND QACVAFELPV SKSTSTHHFR 
VLREAGLITQ RYEGTAILSA LRSEDMEARF PGLLTSVMRA EVEERNAADL PV 

> RXA01001 (1-318, translated) 106 residues

VTVSWHQATD APPSIRITTL APSLQPNQRK VAEVMLVDAP SIVELTAQGL ADRVGVGRAT 
VIRTAQSLGY DGFPQLRVAL AQELALAQGA SRSMVEGALS SSLLGH 



> RXA01065 (1-582, translated) 194 residues 

ILEAVRKVSP KTPILGIITK ADSVSRDLVA AQLMAVHELL GGNSEVVPVS STSGENVETL 

IKVMTDLLPE GPKFYPDDHI TDEDTNTRIA EAIREAALSG LKNELPHSVA VEVDEILPDP 

ERNGVLAVHA IIYVERVGQK DIIVGHKGQR LGRIIHTSRQ DIIKILGQNV FLDLRIKVLK 

NWQSDPKALN RLGF 

> RXA01110 (1-573, translated) 191 residues

MLAIVQLSKE SIIGAAVSIL SEFGLSDMTM RRVAKQLNVA PGALYWHFKN KQELIDATSR 

YLLAPVLGRN DEQRASISAQ ETCAEMRSLM MQTKDGAEVI SAALSNQQLR QELESLISDS 

LKEPNEVGAF TLLHFVVGAV LTEQTQLQMH EFTAGAGDDT QENPADANFE ERFNQGIEII 

LVGLDALGHI R 



> RXA01118 (1-765, translated) 255 residues 

MVEQSPDFVQ SFARGLSVIR SFSADNPSQT LSEVASQTGL SE^TARRFLH TLTDLGYAVN 

NDSRFQLTPR VLELGASYLS ALSLPAIAQP RLEVLSRQVG ESSSMSVLDG TDIIYVCRVP 

VRRIMTVNIT IGTRFPAYAT SMGRIMLANL PEEELDEMLA AAPPEQLTTR SLTSIASIRE 

EIIATRERGW SLVDQELEPG LRSLAAPITN AQGEVVASIN VSTQSASHSV EDIRKLVLPQ 

LLETAQAIST DLSAL 



> RXA01125 (1-213, translated) 71 residues 

MAIIVDIDVM LARRPCMGVGE LAEKIGITPA NLSVLKNGRA KAIRFSTLEA ICRELGCQPG 
DILRYDASLH N 

> RXA01211 (1-795, translated) 265 residues 

MNKDFWTAGW TARWFSRGVS LLASPVTAPL NSWRRLPNLA KYTLYTRVSL QAIPVVLLSA 
YFLGIVANAG TLNPSFVWLL GFSVILLIVT VLVYEYQPSL NSHPRRSVQP FFFTGLVLNV 
LGVVVSVVLQ IPGLNMSDNT RATALIFTLT CVFLLSIAYI PWMNYRWVWL IAMSAVLWWT
STTTDYLSAL WVVIPPLMAG TVRLSVWTVD VMKEVERSRE LEASLRVTEE RLRFAQELHD 
TLGQHLAAMS VKSELALALA KRGDD 



> RXA01241 (1-480, translated) 160 residues 

VDVRHLPETE SRSSKAATQA KSKAPQAGVH DPELAGQTSF VPVVGKIAAG SPITAEQNIE 

EYYPLPAEIV GDGDLFMLQV VGESMRDAGI LTGDWVVVRS QPVAEQGEFV AAMIDGEATV 

KEFHKDSSGI WLLPHNDTFA PIPAENAEIM GKVVSVMRKL 

> RXA01248 (1-429, translated) 143 residues

MADRTPTTAT PPGRVLVVDD EQPLAQMVAS YLIRAGFDTR QAHTGTQAVD EARRFSPDVV 
VLDLGLPELD GLEVCRRIRT FSDCYILMLT ARGSEDDKIS GLTLGADDYI TKPFSIRELV 
TRVHAVLRRP RTSTTPPQVT TPL 



> RXA01272 (1-603, translated) 201 residues 

MSNSFTILTV CTGNICRSPL AKQLLELELP GADIIRVDSA GVQAMVDSPM PEQSLEIARK 

QGIENPEEHR AKQITEELVN QSDLILAMDR GHRKSIVQLS PRATRKVFTV VDLARLIEAT 

TDADLQEELN LAGDSVIDRL HATVEAARLS RSELNPLDNL ADEDIVDPYG KSQSVYEASA 

SQLIPAIRLI ASYLNKALES A 



> RXA01368 (1-129, translated) 43 residues 
KRICQGCPVR DECLEFALEH DERFGIWGGL SERERRRLKR EIS 



> RXA01375 (1-1455, translated) 485 residues

VTEKYRPVRD IKPAPAAMQS TKQAGHPVFR SVVAFVSVLV LVVSGLGYLA VGKVDGVASG
NLNLGGGRGI QDGNAADGAT DILLVGSDSR SDAQGNTLTE EELAMLRAGD EENDNTDTIM
VIRVPNDGSS ATAVAIPRDT YIHDDDYGNM KINGVYGAYK DARRAELMEQ GFTNESELET
RAKDAGREGL IDAVSDLTGI TVDHYAEVGL LGFVLLTDAV GGVEVCLNNA VDEPLSGANF
PAGRQTLGGS DALSYVRQRH DLPRGDLDRI VRQQSYMASL VNQVLSSGTL TNPAKLSALA
DAVTRSVVID EGWEIMSFAT QLQNLAGGNV TFATIPVTSI DGTGDYGESV VTIDVNQVHA
FFQEALGEAE PAPEDGSDDQ SADQAPDLSE VEVHVLNASY VEGLANGIAA QLQELGYSIA
ETGNAAEGLY YESQILAAEE DSAPCALAISE ALGGLPSWPT LPSTTTPSSS YPPAITLALP
RKQTP



> RXA01418 (1-246, translated) 82 residues 

MLGDRTRLRL LIALHYHGPG EATVSELADI VGVTLPTASA ALQLLADNGV VESFKEGRVT 
RYKLVDATTH TLLHHLGGTH RH 



> RXA01450 (1-564, translated) 188 residues 

VPVTLTLGIV GLPNVGKSTL FNALTRNDVL AANYPFATIE PNVGLVELPD ARLERLSEIF 

GSERILPATV SFVDIAGIVK GASEGEGMGN AFLANIREAD AICQVVRAFA DENVIHVDGE 

VNPATDISVI NTELILADLQ TVEKALPRLE KDARKDKGLG EVVDETKKAL AILSDDRTLF 
LCSKSWRH 



> RXA01451 (1-567, translated) 189 residues 

MTAPCFSAAK AGDIDLALLR DLHLMTAKPF LYVFNSDEKV LTDDAKKDEL RALVAPADCV 
FLDAQTETEL LELEEDEAAE LLEAVGQTEP GLHSLARAGF ETLGLQTYLT AGPKESRAWT 
IHKGDTAPQA AGVIHSDFER GFIKAEIVSF EDLDAAGSMA EAKAQGKVRQ EGKDYVMVDG
DVVEFRFNV 



> RXA01500 (1-444, translated) 148 residues 

MATHPDIPTE LLESPSYQLE RLRRRTRDHV EAELAKHETT MREFWTLTCL VHSDAASQSV 

LCELLAIDAS DMVRLVDSLE VRGWAKRERD PKDRRRQIVA STKKGKNAQA DLHKVVLEAE 

DAALDESTSK QLKHLRKLAA AIISTEED 

> RXA01537 (1-651, translated) 217 residues 

MTQAIAASLD LAARITAKID QGVLTPGTRL PEVALAEELG VSRNTLREAF RVLMQDGLVD 

HIPNRGVFVH TFTKSDVEDI YAYRTFIEVA AIRSARKNPQ LLEQSLGVMR EAYERGAAAN 

AVGDWQTVGS ANSAFHLAIV DLAGVARLSA DARKVLALAR IGFMATYNVE TFHSIYVEKN 

HQILKYLAAG EFEEAEQYLQ KYFEDSRDDL SAHLPEF 



> RXA01573 (1-2082, translated) 694 residues

MKRLSRAALA VVATTAVSFS ALAVPAFADE ASNVELNILG VTDFHGHIEQ KAVKDDKGVI
TGYSEMGASG VACYVDAERA DNPNTRFITV GDNIGGSPFV SSILKDEPTL QALSAIGVDA
SALGNHEFDQ GYSDLVNRVS LDGSGSAKFP YLGANVEGGT PAPAKSEIIE MDGVKIAYVG
AVTEETATLV SPAGIEGITF TGDIDAINAE ADRVIEAGEA DVVIALIHAE AAPTDLFSNN
VDVVFSGHTH FDYVAEGEAR GDKQPLVVIQ GHEYGKVISD VEISYDREAG KITNIEAKNV
SATDVVENCE TPNTAVDAIV AAAVEAAEEA GNEVVATIDN GFYRGADEEG TTGSNRGVES
SLSNLIAEAG LWAVNDATIL NADIGIMNAG GVRADLEAGE VTFADAYATQ NFSNTYGVRE
VSGAQFKEAL EQQWKETGDR PRLALGLSSN VQYSYDETRE YGDRITHITF NGEPMDMKET
YRVTGSSFLL AGGDSFTAFA EGGPIAETGM VDIDLFNNYI AAHPDAPIRA NQSSVGIALS
GPAVAEDGTL VPGEELTVDL SSLSYTGPEA KPTTVEVTVG TEKKTADVDN TIVPQFDSTG
KATVTLTVPE GATSVKIATD NGTTFELPVT VNGEGNNDDD DDKEQQSSGS SDAGSLVAVL
GVLGALGGLV AFFLNSAQGA PFLAQLQAMF AQFM



> RXA01655 (1-1359, translated) 453 residues

MLADLPIALN PHEPTSIPTQ LTEQIRRLVA RGILTPGDPL PSSRSLSTQL GVSRGSVVTA
YDQLAGEGYL STARGSGTTI NPDLHLLKPV EIEKKETSRS VPPPLLNLSP GVPDTATLAD
SAWRAAWREA CAKPPTHSPE QGLLRLRIEI ADHLRQMRGL MVEPEQIIVT AGAREGLSLL
LRTMDAPARI GVESPGYPSL RRIPQVLGHE TIDVPTDESG LVPRALPHDL NALLVTPSHQ
YPYGGSLPAD RRTALVAWAE ANDALLIEDD FDSELRYVGM PLPPLRALAP DRTILLGTFS
SVITPQVACG YLIAPTPQAR VLATLRGILG QPVGAITQHA LASYLASGAL RRRTQRLRRL
YRHRRSIVQD TLGDLPNTQL RPINGGLHAV LLCDKPQDLV VTTLASRGLN VTALSHYWGG
TGADNGIVFG FGSHDEDTLR WVLAEISDAV SLG



> RXA01687 (1-1071, translated) 357 residues 

VFMLAQRTLP IHITAPHLPV ARVFHQIRAT DADRTSLQRD LELSQAGITR HVSALIDAGL 
VEETRVDSGA RSGRPRTKLG IDGRHLTAWG VHIGLRSTDF AVCDLAGRVI RYERVDHEVS 
HSTPSETLNF VAHRLQTLSA GLPEPRNVGV ALSAHLSANG TVTSEDYGWS EVEIGAHLPF 
PATIGSGVAA MAGSEIINAP LTQSTQSTLY FYAREMVSHA WIFNGAVHRP NSGRTPTAFG 
NTNTLKDAFR RGLTPTTFSD LVQLSHTNPL ARQILNERAH KLADAVTTAV DVVDPEAVVF 
AGEAFTLDPE TLRIVVTQLR ANTGSQLRIQ RADAYILRTA AIQVALHPIR QDPLAFV 



> RXA01759 (1-762, translated) 254 residues 

MTKRLSLEGL RYAQAVAETH SFSAAAREYG VTQPALSNGI AKLEDRLGEQ LFDRSTQGVT 
PTSFGLHILP LIQRALTEID AITAEAHRLI NSEARSIRVG ISPLINPQLV ARTYTAVREL 
PTAHDLVLRE ANMKELHEGL LAGELNVILI PAVKPLPHFE HRIIDSEPVV IVESTQDSTD 
PIELRETQHE PFILVPDTCG LTTFTNQLFE TNDLALNAYS GEAASYQVLE QWATLGLGSA 
MLPLSKLSSP TAPH 



> RXA01763 (1-465, translated) 155 residues 

MTTSNPTAEI IGGPERFLEA ELSQQIQFLT ARARAKGSAK GNEALVDLGL KVRQYSTLSL 

AASGLKPTQR ELGAFLDLDP SQIVALVDFL EKRGLVAREV DPRDRRSKII IATEKGLEIH 

DEATKRLLIA EGESLKNLTS DEQEQLRELL LKIAF 



> RXA01826 (1-1938, translated) 646 residues 

VTFVIADRYE LDAVIGSGGM SEVFAATDTL IGREVAVKML RIDLAKDPNF RERFRREAQN 
SGRLSHSSIV AVFDTGEVDK DGTSVPYIVM ERVQGRNLRE VVTEDGVFTP VEAANILIPV 
CEALQASHDA GIIHRDVKPA NIMITNTGGV KVMDFGIARA VNDSTSAMTQ TSAVIGTAQY 
LSPEQARGKP ADARSDIYAT GCVMYELVTG KPPFEGESPF AVAYQHVQED PTPPSDFIAD 
LTPTSAVNVD AVVLTAMAKH PADRYQTASE ^4AADLGRLSR NAVSHAARAH VETEETPEEP 
ETRFSTRTST QVAPAAGVAA ASTGSGSSSR KRGSRGLTAL AIVLSLGVVG VAGAFTYDYF 
ANSSSTATSA IPNVEGLPQQ EALTELQAAG FVVNIVEEAS ADVAEGLVIR ANPSVGSEIR 
QGATVTITVS TGREMINIPD VSGMTLEDAA RALEDVGLIL NQNVREETSD DVESGLVIDQ 
NPEAGQEVVV GSSVSLTMSS GTESIRVPNL TGMNWSQAEQ NLISMGFNPT ASYLDSSEPE 
GEVLSVSSQG TELPKGSSIT VEVSNGMLIQ APDLARMSTE QAISALRAAG WTAPDQSLIV 
GDPIHTAALV DQNKIGFQSP TPATLFRKDA QVQVRLFEFD LAALVQ 



> RXA01827 (1-1407, translated) 469 residues 

MSQEDITGKD RLQELIGADY RLQWIIGHGG MSTVWLADDV VNDREVAIKV LRPEFSDNQE 
FLNRFRNEAQ AAENIDSEHV VATYDYREVP DPAGHTFCFI VMEFVRGESL ADLLEREGRL 
PEDLALDVME QAAHGLSVIH RMDMVHRDIK PGNMLITANG IVKITDFGIA KAAAAVPLTR 
TGMVVGTAQY VSPEQAQGKE VTAASDIYSL GVVGYEMMAG RRPFTGDSSV SVAIAHINQA 
PPQMPTSISA QTRELIGIAL RKDPGRRFPD GNEMALAVSA VRLGKRPPQP RTSAMMAQAE 
APSPSESTAM LGRVARPATI TQEAAPKRGS GIGIGLFIAA LLAVIIGAVI YAGTTGILFN 
DTPEETTTPE TITETYTPTV EETTSQWVPP TPPTRSTFTE PETTSHRPTT SEESTSEEPT 
TEAPTSSRTV PQIPTSTPRT SASVPVETNA PADDLIDAVN GLLDVGGAQ 



> RXA01830 (1-1353, translated) 451 residues 

MLKLKYAVAS DRGLVRGNNE DSAYAGPHLL ALADGMGGHA AGEIASQTMI NHLRALDVDP 
GDNDMLALVG MVAGEANAAI AEGIAEDPAR DGMGTTLTAF MFNGRDLAMC HVGDSRGYVL 

RDDKLVQVTV DDTFVQSLVA EGKLDPEDVS THPQRSLILK AYTGHPVEPT LEQFPALPGD 

RLLLCSDGLS DPVTHSTIEE TVRVGTPQDA STKLVELALR SGGPDNVTVI VADVVEVTEA 

EAAAEASVPV TAGALNGEQP EDPRPDTAAG RAAAITRRAQ VIDPAPKISD AGTEDIPTIE 

EPPEKSSSKL AVLIVALVIL IGVVAAGWWG YSRIDSTFYV AVNDEEAITV EHGVDYRIFG 

KDLHSQFQVA CLNEAGTLSL KESCENGTSF KLDDLPASVR GSVAGLPSGS YDEVQAQMQR 

LAAQALPVCV NLEVTTGGDR NEPGVNCREV S 



> RXA01836 (1-705, translated) 235 residues 

MGQQEIIEDS TESGIKVLDR TVLILNVIAE QPRSLAELAA ATDLPRATAH RLASALEVHG 

MLARSRDNRW TIGARLASLG ARGADTLIDT AVPIMADLME RTGESVQLYR LTGTTRTCVA 

SQEPSSGLKN VVPVGTRMPL NAGSAARVFA AYLPIPSASV FSREELDQVR ASGLAESVGE 

RELGLASLSS PVFDSNGSMI AALSISGVAE RLKPHPAAMW GTELIDAAER LGALL 



> RXA01840 (1-654, translated) 218 residues 

ISEEDGASEP ATFAERSQRL IQQECVAAVF GGWTSASRKA MLPVFEGNNS LLFYPVQYEG 
MESSPNIFYT GATTNQQIIP ALDYLRENGL NRLFLVGSDY VFPRTANSII KDYAEANGME 
IVGEDYAPLG STDFTTIANR MRDSNADAVF NTLNGDSNVA FFRQYNSLGF NADTLPVMSV 
SIAEEEVGGI GTANIEGQLV AWDYYQTIDT PENETFVE 



> RXA01860 (1-885, translated) 295 residues 

VNPFILADQL LYDAKHAGRN RVAVRRAENT IVRSAKPAFS VEELSEILES HSIRLELQPI 
LELETGRVGA AEGLLRINLD GTDVPTGQFV QSVEQAGLAP KLDIAVMREG INHIERLRAV 
CPTFSLALNL SGYSLSSAKI REELRAEFRA RDLPRGSIRF EITETAPIED IDAAKEFVQM 
LKDFGFHIVI DDFGAGHEPY QYLKKFDFSV LKIAGEFIEG MVTNRVDRSI VESIAQLAKD 
EEMETVAEFV SSKEILEAVR EIGVTYAQGF HIGKSKPIDE FIATYLETNQ TATWG 



> RXA01861 (1-1965, translated) 655 residues 

VVARDLQKLE KLRLICGYVF LVPAIYLHFF AETSLRGVIL AGIAHAIAGP GVALVMAFME 
NAQLPELLRK RHAFAPFSHI RLPGDVFRLL VAGIVMVAIS KLIVILAYAL ADLPYSFTLY 
LTMALRDLTG IIVVAGPGIA LSTPLVLNIH RSAWREFAVV IIATVGVLAL IFGFAVDLPT 
VYLAMLPLYW SATRLPVLLA VLHAVFTSAI VVILYFLLGT GSFAITDESI LVQATTIQLF 
VLMCILLSLV VSTTVQQTSA LVEELEVVAK TLPDALFIVN KNGTAFPVNA GAKNFVKQSP 
DGHYSMPKLQ NIDGEPMDEK ESPSSMALRG QGVEGVLAKL GEVLGEDPDL ARRIFEISAS 
PMYLRGETEP GHALVIWHDS TNEYYTMQQL TLAYEESRLL FEKAPQGIAM LDPSGEIVMA 
NRSFGDLVGT TPVRLLGRNL EDFGVEEGTM EYVTPVLSDP EAVVHLDRSL ETLRGKQKNV 
AMSFSSMGNV GGRIGTLLVN VVDVTERQEL IELVEHLADH DSLTGLVNRR RLESDIEELI 
LKNERDSTDS ALLLLDLDYF KEVNDSLGHE AGDQLLIEFA EILKDSVRDS DIVGRIGGDE 
FVIVLPDTDR DGAEAIGIRI IELVNQHFKG RGKVLSRVSS KVSAGRSFLM LVPKV 



> RXA01898 (1-693, translated) 231 residues 

MSVKAHESVM DWVTEELRSG RLKIGDHLPS ERALSETLGV SRSSLREALR VLEALGTIST 

ATGSGPRSGT IITAAPGQAL SLSVTLQLVT NQVGHHDIYE TRQLLEGWAA LHSSAERGDW 

DVAEALLEKM DDPSLPLEDF LRFDAEFHVV ISKGAENPLI STLMEALRLS VADHTVARAR 

ALPDWRATSA RLQKEHRAIL AALRAGESTV AATLIKEHIE GYYEETAAAE A 

> RXA01935 (1-1164, translated) 388 residues 

MIGYGLPMPN QAHFSASFAR PSTPAAKCMH HIRLGQQLIR NELVEATGLS QPTVTRAVTA 

LMQAGLVRER PDLTLSSGPG RPNIPLELAP SPWIHAGVAI GTKSSYVALF DTKGRTLRDA 

MLEISAADLD PDTFIEHLIA GVNRLTTGLD LPLVGIGVAT SGKVTNAGVV TASNLGWDGV 

DIAGRLNYQF SVPATVASAI PAIAASELQA SPLPHPEQPT PITLTFYADD SVGAAYSNDL 

GVHVIGPLAT TRGSGLDTLG MAAEDALSTQ GFLSRVSDQG IFANSLGELV TIAKDNETAR 

EFLNDRATLL AHTAAEAAET VKPSTLVLSG SAFSEDPQGR SVFASQLKKE YDADIELRLI 

PTHRENVRAA ARAVALDRLL NEPLTLVP 



> RXA02127 (1-654, translated) 218 residues 

MSETVLVIGA TGSIGRHVVS EALNQGYQVK AFVRSKSRAR VLPAEAEIIV GDLLDPSSIE 
KAVKGVEGII FTHGTSTRKS DVRDVDYTGV ANTLKAVKGK DVKIVLMTAV GTTRPGVAYA 
EWKRHGEQLV RASGHGYTIV RPGWFDYNND DERQIVMLQG DTNQSGGPAD GVIARDQIAR 
VLVSSLNDAK ARNKTFELSA TYGPAQGKPD RNFCSTSG 



> RXA02210 (1-564, translated) 188 residues 

VSVAAGDKPT NSRQEILEGA RRCFAEHGYE GATVRRLEEA TGKSRGAIFH HFGDKENLFL 

ALAREDAARM AEVVSENGLV EVMRGMLEDP ERYDWMSVRL EISKQLRTDP VFRAKWIDHQ 

SVLDEAVRVR LSRNVDKGQM RTDVPIEVLH TFLETVLDGF ISRLATGAST EGLSEVLDLV 
EGTVRKRD 



> RXA02232 (1-1527, translated) 509 residues 

MDEKKNLSHD ELLAQAFRGH KNTVRPGSDE TSGFDLSGFI RAEEPSTGDL DLEARDAQRR 
RDTEIHADEA ADGYEVEYRK LRLERVILVG VWTEGTTAEI DASLAELAAL ADTAGAEVIE 
TLYQKRDKPD PGTYIGSGKV RELKEIIEAT SADTVVCDGE LSPSQLVALE RELDIKVIDR 
TMLILDIFAQ HAKSREGKAQ VALAQMEYLI SRVRGWGGNL SRQAGGRAGS NGGVGLRGPG 
ETKIEADRRR LRSDMARLRR ELSGLDTSRS IKRAQRAASL VPQIAIAGYT NAGKSSLINA 
MTGAGVLVEN ALFATLDPTT RKAELADGRH VVFTDTVGFV RHLPTSLVEA FKSTLEEVVE 
ADLMLHVVDG SDPFPLKQID AVNTVISDIV RSTGAVPPPE IIVVNKIDQA DPLTLAELRH 
AVDDVVFVSA LTGEGIKELE ARIELFLNSR DAHLLLKIPF TRGDIVSRLH QHGTVLSEDY 
AEDGTLMDVR IPTQLAQELQ SYVVEPTSA 



> RXA02270 (1-621, translated) 207 residues 

MDQARPNRTH YAMVELEQHG FLSGVVTQNV DGLHAEAGTK NLVALHGDLA HVMCLNCGFG 
EDRHLFDERL EAANPGYVAS IRLEPGAVNP DGDVFLDEEQ VRRFTMIGCL RCGSLMLKPD 
VVYFGEPVPA ARKKDLKKLL DASSSLLIAG SSLAVMSGYR IVIEAQRQGK QVSVINGGPG 
RADSRVDILW RTRVAPAFDD ILDALDL 



> RXA02306 (1-291, translated) 97 residues 

MQPEEVHIKD ETIKLGQFIK LANLVESGGA AKDAIANGDV TVNGEVDTRR GKTLRDGDVV 
CIGEVCAQVS TGDAADDDYF DEATANDDFD PEKWRNM 

> RXA02376 (1-1503, translated) 501 residues 

MNRFIDRVVL HLAAGDGGNG CVSVHREKFK PLGGPDGGNG GHGGDIILEV TAQVHTLLDF 

HFHPHVKAER GANGAGDHRN GARGKDLVLE VPPGTVVLNE KGETLADLTS VGMKFIAAAG 

GNGGLGNAAL ASKARKAPGF ALIGEPGEAH DLILELKSMA DVGLVGFPSA GKSSLISVMS 

AAKPKIGDYP FTTLQPNLGV VNVGHETFTM ADVPGLIPGA SEGKGLGLDF LRHIERTSVL 

VHVVDTATMD PGRDPISDIE ALEAELAAYQ SALDEDTGLG DLSQRPRLVV LNKADVPEAE 

ELAEFLKEDI EKQFGWPVFI ISAVARKGLD PLKYKLLEIV QDARKKRPKE KAESVIIKPK 

AVDHRTKGQF QIKPDPEVQG GFIITGEKPE RWILQTDFEN DEAVGYLADR LAKLGIEDGL 

RKAGAHVGAN VTIGGISFEW EPMTTAGDDP VLTGRGTDVR LEQTSRISAA ERKRASQVRR 

GLIDELDYGE DQEASRERWE G 



> RXA02392 (1-1260, translated) 420 residues 

MAEKFAETTF TDPARIRNFC IIAHIDHGKS TLADRILQLS NVVDARDMRD QYLDNMDIER 
ERGITIKAQN VRLPWIPRSG EYEGQQIVMQ MIDTPGHVDF TYEVSRALEA CEGAILLVDA 
AQGIEAQTLA NLYLAMENDL EIIPVLNKID LPAADPDKYA LEIANIVGCE PEDVLRVSGK 
TGMGVPELLD KVVELIPAPT SEFEEDAPAR AMIFDSVYDT YRGVVTYIRM MDGKLTPRQK 
IKMMSTGATH ELLEIGIVSP TPKKCVGLGP GEVGYLITGV KDVRQSKVGD TVTWAIHGAE 
QPLRGYQEPT PMVYSGLFPI SQADFPDLRD ALEKLQLNDA SLTYEPETSV ALGFGFRCGF 
LGLLHMEITR DRLEREFGLD LISTAPSVNY RVIDEAGKEF RVHNPSDWPG GKLSEVYEPI 



> RXA02450 (1-555, translated) 185 residues 

MNLKDLKAAE TRQRFIDVAH ELFLEHGYGS TSMNQIAQAA GGSRANLYLH FRNKPDLMMA 

KMRELEPAVR TPVLKVFDLP EHTLESILRW LDSMTEVWKA NAKVFGAMEQ AMVEDAAVAD 

EWLSMMQRLS QSVPELVENE ERRVQFLASL MGMDRNFYFL YVRGQDVDEE LLKLAVARQW 
LAVFQ 

> RXA02493 (1-1239, translated) 413 residues 

VSTLLAFVLG VVLMGLALPA YTKIKDRMRR HKSAVTLSEN QVTTVGQVLH LAIQGSPTGI 

TVVDRTGDVI LSNGRAHELG IVHERSVDGN VWRVAQEAFQ DQETHSLDVH PDRNPRRPGS 

RITAVQAVVK PLTLIDDRFV IIYASDESEN VRMESARRDF VANVSHELKT PVGGMALLAE 

ALMESSDDPE QVEYFGSRLH REAHRMADMI NELISLSKLQ GAERLPDMEP VQADDIISEA 

IERTQLAADN ANIEIIRGDR TGVWVEADRS LLVTALANLI SNAINYSPKS VPVSVSQSIR 

NDVVMIRVTD RGIGIAPEDQ GRVFERFFRV DKARSRQTGG TGLGLAIVKH VMANHGGSIS 

LWSRPGTGST FTLELPVYHP ESKEPAGSKQ GPSLDSPIRT TASKASGRRK EKS 

> RXA02494 (1-696, translated) 232 residues 

MTRILIVEDE ESLADPLAFL LRKEGFDTII AGDGPTALVE FSRNEIDIVL LDLMLPGMSG 
TDVCKELRSV STVPVIMVTA RDSEIDKVVG LELGADDYVT KPYSSRELIA RIRAVLRRRG 
VTETEAEELP LDDQILEGGR VRMDVDSHTV TVGGEPVSMP LKEFDLLEYL LRNAGRVLTR 
GQLIDRIWGA DYVGDTKTLD VHVKRLRSKI EEEPSRPRYL VTVRGLGYKF EL 

> RXA02631 (1-1365, translated) 455 residues 

MSLRWRLALL SATLVAFAVG VITVAAYWSV SSYVTNSIDR DLEKQADAML GRASEAGFYA 

TAETEIALLG EYASDTRIAL IPPGWEYVIG ESISLPDSDF LKSKEAGKQI LVTSAERILM 

KRDSSGTVVV FAKDMVDTDR QLTVLGVILL IIGGSGVLAS ILLGFIIAKE GLKPLSKLQR 

AVEEIERTDE LRAIPVVGND EFAKLTRSFN DMLKALRESR TRQSQLVADA GHELKTPLTS 

MRTNIELLLM ATNSGGSGIP KEELDGLQRD VLAQMTEMSD LIGDLVDLAR EETAETSSIV 

DLNQVLEIAL DRMESRRMTV RIDVSETVDW KLLGDDFSLT RALVNVLDNA IKWSPENGIV 

RVSMSQIDKA TVRIVIDDSG PGIAEKERGL VLERFYRAVS SRSMPGSGLG LAIVNQVVNR 

HGGQLVVGES DDGGTRITID LPGEPIRSGF ENVDD 

> RXA02632 (1-696, translated) 232 residues 

MKILVVDDEQ AVRDSLRRSL SFNGYNVVLA EDGIQALEMI DKEQPALVIL DVMMPGMDGL 
EVCRHLRSEG DDRPILILTA RDNVSDRVGG LDAGADDYLA KPFALEELLA RVRSLVRRSA 
VESNQSSSIE QALLSCGDLT LDPESRDVYR NGRAISLTRT EFALLQLLLK NQRKVLTRAQ 
ILEEVWGCDF PTSGNALEVY IGYLRRKTEL EGEDRLIHTV RGVGYVLRET AP 

> RXA02667 (1-594, translated) 198 residues 

MEFKVGDTVV YPHHGAAIIS ALEQREMNGE TVDYLVLQIN HSDLVVRVPA KNAELVGVRD 
VVGEEGLQKV FSVLREIDVE EAGNWSRRYK ANQERLASGD VNKVAEVVRD LWRRDQDRGL 
SAGEKRMLSK ARQVLVGELA LAETVDDEKA DAFLSQVDET IARHRADLLG DEEEKKDAFD 
DFDDSDVDLD DLSFDDED 

> RXA02668 (1-723, translated) 241 residues 

MTNPSPALNE TLSGRVLIVE DERPLARMIS LYLSKAGFDT TTIHDGAAAP DKVAHLRPDV 
VILDLGLPGL DGLEVCKRIR AFTDCYILML TARGSERDRI TGLEIGADDY ITKPFNIREL 
VIRIQSVMRR PRKIDETIQN GLTLTYGHIE LDTLAHEVTV KGVGVTLTRT EFELLQALMH 
KPGEAVSRRD LVSQVWDTTW VGDERIVDVH IGNLRRKLEA PAPGSHFIDT IRGVGYRMAF 
K 

> RXA02669 (1-1116, translated) 372 residues 

MTALIPARHS LTFRLLTAQL AVVLISLLAA LIVAALVGPA IFNSHLDLSG PIDPRQTDFH 

IQEAYRDANY IALAAALPTA VLSSIGVSFW LSHRLGQPLW RLSRAATAMS SGDYQVRVPI 

SDVDKEVAAL SLAFNSMADQ LEHTEELRRN MLSDLSHEMN TPLSVLLVYV DGLQDGMVEW 

DADTHAVFAE QLGRLSRLTS DLDDVSRAQE HRFDLVYSTV AIGGLIHNAA GAAAGSYQEK 

GVALEVTGSD STELIRVDSQ RFAQVMANLF SNALRHTPAG GKVHVRVLRQ GVGTIVIEVI 

DNGEGIAPEH VKYVFERYFR AKRSDSDDQS GSGIGLTISR ALIEAQGGTL TAESAGLGKG 
AKFTIRLPLL SK 

> RXA02698 (1-369, translated) 123 residues 

VSSNNESSFA LPDNEPLLTL PETAERLGVV VTKVMDLVNE HKLIVVRRDG IRYIPEAFLS 

TKKENTNRFI PGVIALLADG GFSDEEILAF LFTEDETLPG RPIDALHGQL AREVMRRAQA 
MAF 

> RXA02699 (1-2148, translated) 716 residues 

MSTVYRCLDL RLGRSMALKV MEEDFVDDPI FRQRSRREAR Sb4AQLNHPNL VNVYDFSATD 

GLVYLVMELI TGGTLRELLA ERGPMPPHAA VGVMRGVLTG LAAAHRAGMV HRDIKPDNVL 

INSDHQVKLS DFGLVR7y\HA GQSQDNQIVG TVAYLSPEQV EGGEIGPASD VYSAGIVLFE 

LLTGTTPFSG EDDLDHAYAR LTEVVPAPSS LIDGVPSLID ELVATATSIN PEDRFDDSGE 

FLSALEDVAT ELSLPAFRVP VPVNSAANRA NAQVPDAQPT DMFTTHIPKT PEPDHTAIIP 

VASANETSIL PAQN^4AQN^4A QNPLQPPEPD FAPEPPPDTA LNIQDQELAR ADEPEINTVS 

NRSKLKLTLW SIFVVAVIAA VAVGGWWFGS GRYGEIPQVL GMDEVQAVAV VEEAGFVAVA 

EPQYDNEVPT GSIIGTEPSF GERLPRGEDV SVLVSQGRPV VPDLSEDRSL STVREELEQR 

TFVWVDGPGE YSDDVPEGQV VSFTPSSGTQ LDVGETVQIH LSRGPAPVEI PDVSGMGVDQ 

ATRVLERAGL SVERTEEGFD AETPNGDVYG TSPKVSTEVK RGTSVVLQVS NAISVPDVVG 

MTKDEATAAL AEEGLVVAST SIIPGEAASS ADAVVTVEPE SGSRVDPAHP QVSLGLAGEI 

QVPSVVGRKV SDARSILEEA GLTLTTDADD NDRIYSQTPR ARSEVSVGGE VTVRAF 

> RXA02724 (1-867, translated) 289 residues 

MLIGEVSKLS GVSARMLRHY EKLGLVEPKQ STAGYREYSE GDVRRIFHIE GLRSLGLSLK 

QVGDALEDPD FDPQAVISEM IAETSARISM ERELLARLKA VRHAQASDWE SALDAVQILR 

RLRSGDPAQR QAVAYDSVSG KEAVALETLV ESALGESHLN AEGALSWAVV QRGEEAVALA 

ARGLRSRDAA VRLRAVRIVA SAPSAVADRV EWLRPMIRDP DALVRAETAL ALGKSGDESA 

VEQLVSMVLT GLRDVEAAEL LAGFGEPVQL DVFKKFARTL DDEETMSPT 



> RXA02747 (1-2076, translated) 692 residues 

MNNPAQLRQD TEKEVLALLG SLVLPAGTAL AATGSLARSE LTPYSDLDLI LIHPPGATPD 
GVEDLWYPIW DAKKRLDYSV RTPDECVAMI SADSTAALAM LDLRFVAGDE DLCAKTRRRI 
VEKWRQELNK NFDAVVDTAI ARWRRSGPVV AMTRPDLKHG RGGLRDFELI KALALGHLCN 
LPQLDAQHQL LLDARTLLHV HARRSRDVLD PEFAVDVAMD LGFVDRYHLG REIADAARAI 
DDGLTTALAT ARGILPRRTG FAFRNASRRP LDLDVVDANG TIELSKKPDL NDPALPLRVA 
AAAATTGLPV AESTWVRLNE CPPLPEPWPA NAAGDFFRIL SSPKNSRRVV KNMDRHGLWS 
RFVPEWDRIK GLMPREPSHI STIDEHSLNT VAGCALETVT VARPDLLVLG ALYHDIGKGF 
PRPHEQVGAE MVARAASRMG LNLRDRASVQ TLVAEHTAVA KIAARLDPSS EGAVDKLLDA 
VRYDLVTLNL LEVLTEADAK ATGPGVWTAR LEHALRIVCK RARDRLTDIR PVAPMIAPRS 
EIGLVERDGV FTVQWHGEDL HRILGVIYAK GWTITAARML ANGQWSAEFD VRANGPQDFD 
PQHFLQAYQS GVFSEVPIPA LGITATFWHG NTLEVRTELR TGAIFALLRT LPDALWINAV 
TRGATLIIQA ALKPGFDRAT VERSVVRSLA GS 



> RXA02760 (1-954, translated) 318 residues 

MSDENINEFE QDEDLNFGAS FSDEFADDDF DAEADVEADA AAEASALEAE QDLEEETLDA 

PEEAAEEAPA AAESEAPVEE DEEADSLAQA AAALGDTDEQ DADAEYKARL RKFTRELKKQ 

PGVWYIIQCY SGYENKVKAN LDMRAQTLEV EDDIFEVVVP IEQVTEIRDG KRKLVKRKLL 

PGYVLVRMDM NDRVWSVVRD TPGVTSFVGN EGNATPVKHR DVAKFLMPQE QAVVTGEAAA 

AAAEGEQVVA MPTDTKKPQV AVDFTVGEAV TILTGAFASV SATISSIDPE LQKLEVLVSI 

FGRETPVDLS FDQVEKVS 



> RXA02763 (1-984, translated) 328 residues 

VKLTDAAREA GVGYGTASRA ISGRGSVDAA TRDKVLAAAE KLGYRTNAMA RALRENKTRT 

VGLIVPGIIN KFYTESATVL QDELDKSGYQ LVVSTTGNDA EKERRAIESM LNRQVDAVVH 

APVNPQAKFP KGFKVVELNR RSDLNRPTVT SDDATGLKEL ALHILDQGYR DIGIIVGPAE 

LSTARDRKAG FINALETEAT QRGIREELRF RWHSRYSPT GGYEAFAEFR NDLPQIVVPL 

STQLTLGVLK ATQENGIKIS DDLSLACYGV AEWLAVWGPG ITVFAPDLPA MGAAAATQVL 
TLLDAAPLPE NHLSIPGQLI VRGTTPKV 



> RXA02787 (1-1377, translated) 459 residues 

MAQDSLFETP ETPGSAGNTS SVSNSKAASK YFHPGGHAPL AARMRPRTLD EVVGQQHLLG 
EGRPLRRLIE GSGDASVILY GPPGTGKTTI ASLISAAAGD RFVAMSALSS GVKEVRAVIE 
RARMDLQLGQ RTVLFIDEVH RFSKTQQDAL LSAVENRTVL LVAATTENPS FSVVSPLLSR 
SLLLQLESLS DEDIKTVLNK ALEDERGLAG RITATDEAVD QLVLLAGGDA RRGLTYIEAA 
AEAVEDGGVL DIDTVMANVN RAVVRYDRDG DQHYDVVSAW IKSIRGSDVD AALHYLARMI 
DAGEDPRFIA RRLVVHSSED IGMADPSAMQ VAIAAAQAVQ LIGMPEARIN LAQATIHLAL 
APKSNAVIMA MDAALTDVQQ GHIGTVPAHL RDGHYEGAKK LGNAVGYSYP HDDPRGVVRQ 
EYLPENLRDR VYYEPTTHGG EKRIAEYIGR LRRIIRGTK 



> RXA02830 (1-495, translated) 165 residues 

LEDSLGVSLF ERAGRGLALT GAGDQLLSQA RRLIALNDEV YARLNAGAYE GEVTLGVPQD 

VIYPVIPRVL QQFARDFPRV QIHLISNFTL MLKEQFRRGE IDVMLTTEDE LGEGGETLAQ 

RELIWVGAPG GSAWTRRPLP LAFERACIFR SFLQRRLDAN SIYWQ 

> RXA02831 (1-408, translated) 136 residues 

MTHRITPELS AELRGVAHSL ADAARPVTLQ YFRTAVAADN KGALRGMAYD PVTIADRASE 

QAMRDILARL RPDDAILGEE FGPKAGTTGL TWVLDPIDGT RAYIAGAPTW GVLIAVSDDQ 

GPLFGIVDQP YIGERF 



> RXA02880 (1-414, translated) 138 residues 
VETQAFQRQN TGLIAMVAAD ASNPFFLEIF RGAQHAASTQ GYTVALVDAR ESAIKSREVL 
DKIVPHADGL LLAASRMDSG EIHKVAREIP TVLMSREVQG IPSVMVDNYD GAPKAVVHLV 
DQGCRSITYI AGPNKSWA 

>RXN00031 TRANSLATE of: rxn00031.seq check: 1852 from: 1 to: 402 
VAAGQWLAGNIGEIDHVLCSDATRTQLTWERVQLGGATAKGSSFHNDIYENQVSEFKHLI 
TGLPDVVGTALLIGHWPGVEELAHYFGIRDEHPGWDQMEEKFPTSAIAVLEFNTPWSKLE 
RNSARLTDFVIPRG 

>RXN00035 TRANSLATE of: rxn00035.seq check: 7800 from: 1 to: 357 

VPLYKQIASLIEDSIVDGTLSIDQRVPSTNELAAFHRINPATARNGLTLLVEAGILYKKR 

GIGMFVSAQAPALIRERRDAAFAATYVAPLIDESIHLGFTRARIHALLDQVAESRGLYK 

>RXN00049 TRANSLATE of: rxn00049.seq check: 4399 from: 1 to: 687 

MPTPSQHKDASTAQTDNQVPTGRRAQKREQTRARLITSARTLMAERGVDNVGIAEITEGA 

NIGTGTFYNYFPDREQLLQAVAEDAFESVGIALDQVLTKLDDPAEVFAGSLRHLVRHSLE 

DRIWGGFFIQMGAAHPVLMRILGPRARRDLLHGLETGRFTIEDLDLATTCTFGSLIAAIQ 

MALSADQDSNDDKDQIFAAAMLRMVGVQAAEAREIASRPLPEISPVKPQ 

>RXN00291 TRANSLATE of: rxn00291.seq check: 8375 from: 1 to: 1572 

VATVALVVAICTGIFAVLMMDQMKTEAEHTALSIGRWVASNPQIREEVALDTQTGANPSA 

EELADGDIQAVAQAANERTGALFVVITDGLGIRLSHPDEERLGEQVSTSFEAAMRGEETM 

AWETGTLGASARAKVPIFAPDSSVPVGEVSVGFERDSVYSRLPMFLAALALISVLGILIG 

VGVAMGMRRRWERVTLGLQPEELVTLVQNQTAVIDGIDEGVLALSPNGTIGVHNEQAQSM 

IGAGPMSGRTLKELGLDLGLDGVVLHGQHPETVAHNGRILYLDFHPVRRGDQDLGYVVTI 

RDRTDIIELSERLDSVRTMTHALRAQRHEFANRIHTATGLIDAGRVHDAAEFLGDISRNG 

GQSHPLIGSAHLNEAFLSSFLSTASISASEKGVSLRINSDTLILGTVKDPEDVATILGNL 

INNAIDAAVAGEAPRWIELTLMDDADTLVISVADSGPGIPEGVDVFATATQIGDSEDNER 

THGHGIGLKLCRALARSHGGDVWVIDRGTEDGAVFGVKLPGVME 

>RXN00363 TRANSLATE of: rxn00363.seq check: 1381 from: 1 to: 720 

MSDMPTKRVAPARSLTDQVMDFVRESTLDKTMVTGEWYSVYQVSDQLGISRSPVRDALLR 

LEEAGLIRFTRNRGFQIVETKPSDVAEIFALRLGIEPAAAYRAAQLRTEEQLHEADDIIA 

LMAQAEADNDEEAFFTHDRQFHRQIMTMGHSQRGADLVEKLRAHTRILGASTAGNKRTLG 

DILEEHEPILDAIKRQSAEMARATMREHIQVTGKLLLEQAVEKSGEGAAQKIWDQYTAGV 

>RXN00464 TRANSLATE of: rxn00464.seq check: 3000 from: 1 to: 774 

MSERQLEKSIEHAVELAREARNIEVFTGAGMSADSGLETYRDDKTGLWSNVDPQAMASID 

AWRKDPEPMWAWYRWRAGVAARAEPNAGHQAISYWEGSDTVEHVHITTQNIDNLHERAGS 

SDVTHLHGSLFEYRCSDCATPWEDDKNYPQEPIARLAPPQCEKCGGLIRPGVVWFGENLP 

VEEWDIAEQRIAEADLMIIVGTSGIVHPAAALPQLAQQRGVPIVEISPTRTELSRIADFT 

WMSTAAQALPALMRGLSA 

>RXN00467 TRANSLATE of: rxn00467.seq check: 7535 from: 1 to: 669 

MHISDLPDRSQDYLKTIWDITELLDDQPAALGDIAEKMNQKTPTASEAIKKLAARGLVNH 

EKYAGVTLTEQGKTLAIDMVRRHRLLETFLHDVLGYTWDEVHADADLLEHAASDQLIERI 

DAHLGRPRKDPHGDPIPTAEGVIEESPRTTLEAVQPGETVTISRVKDIDPELLRYLAQYN 

VSPGCRITVASGPLAGMVHVVVEGTDTSFPLAETQLPLITVQD 

>RXN00486 TRANSLATE of: rxn00486.seq check: 5776 from: 1 to: 909 

VLNLNRLHILQEFHRLGTITAVAESMNYSRSAISQQMALLEKEIGVKLFEKSGRNLYFTE 

QGEVLASETHAIMAAVDHARAAVLDSLSEVSGTLKVTSFQSLLFTLAPKAIARLTEKYPH 

LQVEISQLEVTAALEELRARRVDVALGEEYPVEVPLVEASIHREVLFEDPMLLVTPASGP 

YSGLTLPELRDIPIAIDPPDLPAGEWVHRLCRRAGFEPRVTFETSDPMLQAHLVRSGLAV 

TFSPTLLTPMLESVHIQPLPGNPTRTLYTAVREGRQGHPAIKAFRRALAHVAKESYLEAR 

LVE 

>RXN00551 TRANSLATE of: rxn00551.seq check: 7796 from: 1 to: 471 
MLAGMPNLNAEELAVRVRPALTKLYVLYFRRSVNSDLSGPQLTILSRLEENGPSRISRIA 
ELEDIRMPTASNALHQLEQLNLVERIRDTKDRRGVQVQLTDHGREELERVNNERNAEMAR 
LLEMLTPEQLERTEDLVDIITELAEVYGSWKETDSGS 

>RXN00617 TRANSLATE of: rxn00617.seq check: 9185 from: 1 to: 228 

MDEQEALFDRFSRGSQKNSRRPGGAGLGLSIVKAIGEAHVGRAFVNSTPGLGSIFGLEIP 

APEQSKEYTHEQDPAR 

> RXA00630 (1-159, translated) 53 residues 

AKILDNVWHY DFGGDGNVVE SYISYLRRKV DTQDPQLIQT VRGVGYVLRT PRS 

>RXN00631 TRANSLATE of: rxn00631.seq check: 5796 from: 1 to: 1455 

MENPYVAALDDENQEVGVKKEAEKEPEIGPIRAAGRAIPLRTRIILIVVGIAGLGLLVNA 

IAVSSLMREVSYTRMDQELETSMGTWAHNVELFNFDGVRQGPPSDYYVAKVFPDGSSIIF 

NDAQSAPDLAETTIGTGPHTVD7UVSGSASNTPWRVMAEKNGDIITVVGKSMGRETNLLYR 

LVMVQMIIGALILVAILITSLFLVRRSLRPLREVEETATRIAGGDLDRRVPQWPMTTEVG 

QLSNALNIMLEQLQASILTAQQKEAQMRRFVGDASHELRTPLTSVKGFTELYSSGATDDA 

NWVMSKIGGEAQRMSVLVEDLLSLTRAEGQQMEKHRVDVLELALAVRGSMRAAWPDRTVN 

VSNKAESIPVVKGDPTRLHQVLTNLVANGLNHGGPDAEVSIEINTDGQNVRILVADNGVG 

MSEEDAQHIFERFYRADSSRSRASGGSGLGLAITKSLVEGHGGTVTVDSVQGEGTVFTIT 

LPAVS 

>RXN00651 TRANSLATE of: rxn00651.seq check: 9352 from: 1 to: 1332 

MQSSLDRVSETGRNELDVETLVKKGNQPGAMSYRNSIHILTASLLVVGLGASARLTLPMF 

ALSCVLLFVWGFLYFYGSTKRVDLSHGMQLGWLFVLTLVWIFMVPIVPVSIYLLFPLFFL 

YLQVMPDVRGIIAILGATAIAIASQYSVGLTFGGVMGPVVSAIVTVAIDYAFRTLWRVNN 

EKQELIDQLIETRSQLAVTERNAGIAAERQRIAHEIHDTVAQGLSSIQMLLHVSEQEILV 

AEMEEKPKEAIVKKMRLARQTASDNLSEARAMIAALQPAALSKTSLEAALHRVTEPLLGI 

NFVISVDGDVRQLPMKTEATLLRIAQGAIGNVAKHSEAKNCHVTLTYEDTEVRLDVVDDG 

VGFEPSEVSSTPAGLGHIGLTALQQRAMELHGEVIVESAYGQGTAVSAALPVEPPEGFVG 

APVLADSDSSATGEVELSSPTDDE 

>RXN00822 TRANSLATE of: rxn00822.seq check: 6060 from: 1 to: 681 

VEGVQEILSRAGIFQGVDPTAVNNLIQDMETVRFPRGATIFDEGEPGDRLYIITSGKVKL 

ARHAPDGRENLLTIMGPSDMFGELSIFDPGPRTSSAVCVTEVHAATMNSDMLRNWVADHP 

AIAEQLLRVLARRLRRTNASLADLIFTDVPGRVAKTLLQLANRFGTQEAGALRVNHDLTQ 

EEIAQLVGASRETVNKALATFAHRGWIRLEGKSVLIVDTEHLARRAR 

>RXN00826 TRANSLATE of: rxn00826.seq check: 3755 from: 1 to: 531 
MITVLIDGQSGAGKTTLAGELAARTGFQLVHLDDFYPGWTGLEAASEIVARHVLDADNPG 
FFTWDWHNNCQGDWIKLEPGRSLIIEGSGSITAATKRKASLLGELVTVRITGPEALRKQR 
ALNRDPDYAPFWKVWAQQEQRHFSLGVEVDHEIVLGSDEASGRPEEIYDSLGTAQSS 

>RXA00848 (1-171, translated) 57 residues 

TTVTLAKARS LSLDEALEFC GVDECVEVTP DVLRIRKVIL NATERGRARS RAKSLNK 

>RXN00849 TRANSLATE of: rxn00849.seq check: 7079 from: 1 to: 321 

MVTYTTLLDKPISESAPRKAPEPLLREALGAALRSFRADKGVTLRELAEASRVSPGYLSE 

LERGRKEVSSELLASVCHALGASVADVLIEAAGSMALQAAQEDLARV 

>RXN00978 TRANSLATE of: rxn00978.seq check: 9465 from: 1 to: 615 

MSRSPLTKGLNQLEHLELDKSLTAWSWAEDDPLYLAGENLNGSYLIVAGRVRVSRDTIDG 

KELTVDIATPGDVIGAIDTEPQPAVDSAWAIETTCALFLPATALATVIEQHPSFALAMIR 

MQQQRLATARDHEINLTTTTVEQRVAIAVRTLGRKIGQRRPDGILLIQVRIRREDVAGLA 

GTTVESTSRVLARLRKEGVIDSGRE 

>RXN01081 TRANSLATE of: rxn01081.seq check: 5396 from: 1 to: 750 

MTPANESPMTNPLGSAPTPAKPLLDSVLDELGQDIISGKVAVGDTFKLMDIGERFGISRT 

VAREAMRALEQLGLVASSRRIGITVLPQEEWAVFDKSIIRWRLNDEGQREGQLQSLTELR 

IAIEPIAARSVALHASTAELEKIRALATEMRQLGESGQGASQRFLEADVTFHELILRYCH 

NEMFAALIPSISAVLVGRTELGLQPDLPAHEALDNHDKLADALLNRDADAAETASRNILN 

EVRSALGTLN 

>RXN01160 TRANSLATE of: rxn01160.seq check: 9223 from: 1 to: 975 

KSSNKISDLARQLNLLPYFTRYKGRTVMEAARDLGQPSSQIMEDLNRLWMCGLPGLLPGD 

LVELDHSFKEVKIHNAQGMDKPLRLTPTEAGVLLLTLESLESLPGIAKQEAVVSAANKLR 

AIMGEYSSTVFDSTGEDLDAEVLEIIRDAMDLHQQVSFEYHSHRSDNTSLRQVSPAHIFT 

HEGETYIKAWEEAVNQWRTFRLDRIRSIVLLDSKAVHPARGVSVSTDDPFEFAKSSDIAT 

LLLREDAMWLGNYMAMEVDETVEPIRDSDGFSWHTVHFPLLSRDWFVRFAIGHAEHLKVT 

SPEDLRKCIKQKAFSGLSAYDHHVE 

>RXN01211 TRANSLATE of: rxn01211.seq check: 9180 from: 1 to: 1257 

MNKDFWTAGWTARWFSRGVSLLASPVTAPLNSWRRLPNLAKYTLYTRVSLQAIPVVLLSA 

YFLGIVANAGTLNPSFVWLLGFSVILLIVTVLVYEYQPSLNSHPRRSVQPFFFTGLVLNV 

LGVVVSVVLQIPGLNMSDNTRATALIFTLTCVFLLSIAYIPWMNYRWVWLIAMSAVLWWT 

STTTDYLSALWVVIPPLMAGTVRLSVWTVDVMKEVERSRELEASLRVTEERLRFAQELHD 

TLGQHLAAMSVKSELALALAKRGDDRLENELRELQKLTRTSMSEMRDVVSGYRTVNLATE 

IEGAKSLLADAHIHLSVIGTTSQVSPAHRELCAWLVREATTNILRHSDATDATLTLSSTE 

VRMDNNGVNKDIGRLSGLSALRSRAESAGMTLIVSREDDQFSVRMLINAPANTPAEKEA 

>RXN01315 TRANSLATE of: rxn01315.seq check: 4618 from: 1 to: 651 

VDIEEQPSLREIKRQMTLEAIEDNATRLILERGFDNVTIEDICAEAGISKRTFFNYVESK 

ESVAIGHTAKLPTDEEREAFLATRHENIIDTVFDLVINLFGNHDNSKSGVAGDIMRRRKE 

IRVKHPELAVQHFARFHQAREGLEHLIVEYFEKWPGSQHLDEPADREAIAIVGLLISVML 

QGSREWHDMPQGTQADFQACCRKAIKNTFLLRGGFSE 

>RXN01349 TRANSLATE of: rxn01349.seq check: 9951 from: 1 to: 777 

MATSRRDAENIDQAGSEFIESDSGHTATPEEVVATALTFFAEDGFSETKLEKIAKASGMS 

KRMIHYHFGDKKGLYIKAVSYALRLLRPEAEAMQLDSAVPVDGVRKIVEALYTCITKHPE 

AVRLLLMENLHSQDSVDSTAAYSDESNVLLNLDKLLMLGQDAGAFRPGISAEDVLVLISS 

LAYFRVSNKVTLKNLYSLDLESEANIEGMKRIVVDTVLAFLTSNIQNSGNSSYLVVGGKT 

AEPETDDSVYSFDTDVFEN 

>RXN01368 TRANSLATE of: rxn01368.seq check: 7182 from: 1 to: 312 

MEDSAGDVSAKLKAGQTRTALEMTLDDLFGAVEQEWQEQALCAQTDPEAFFPEKGGSTRE 

AKRICQGCPVRDECLEFALEHDERFGIWGGLSERERRRLKREIS 

>RXN01445 TRANSLATE of: rxn01445.seq check: 6964 from: 1 to: 993 

MIPLINVRFPVAALPLALVATVWLNAWADHLLLTGFIVYLAVEYATSRGRFALALILGVE 

WILIAYGVALERPLEAKDSPSLITEILLILVAAGTGAGRWKILSERKQRAITQQEIIKKI 

RTDIAHYLHDSMARSLAIMIVQSKLTELEPDPKKIQEKLNSIAKIGQEAVADLHQLVRHL 

VVEESAEKATAFGAWAAVSIHDTVNSAIQLLVDAGHVVSFDSRKKNYKLDHIAETAFALA 

FNEAVCNAIKHSPPKANVTIRITEKAQSLQILVMNPIGDWHANGESAIPGVGIGVESLTR 

RIRNIKGQVCVTSLQGYWKVVISLPLKCEDS 

>RXN01773 TRANSLATE of: rxn01773.seq check: 9269 from: 1 to: 477 
MTVDLYQARIPFQRDGVRFDHTMITHIQAGLHLGGCRAAGLLPIPAHIDHIVRLTAADFY 
DTQSAPQLLSNTVLDVLDTTTQDLKALWPVAEHIATTIPESENVLIHCQMGINRSAALMT 
RVLMLRNDCTADEAIALLRDRRSPFVLFNEHFVEQLE^L 

>RXN01845 TRANSLATE of: rxn01845.seq check: 6514 from: 1 to: 408 
MISNSWAIETTCALYLPVEALAEVVDAYPQLALAIMRMQQDQLVRSRERETAQTTSTVEQ 
RVAAALQHLDAKLGQIRQDGSSLLQVRLRRDDVAGTTVESASRAMARMKKTGVIDSGREW 
IAITNHQALADLVAGL 

>RXN02097 TRANSLATE of: rxn02097.seq check: 7698 from: 1 to: 3372 

MPAGIADMTDSLLGWASQTELDLNQRLAGVEYFPQIQLRHDELERIHRFYGTFLSRQVGA 

GASLGDLFEMTPCLTVTTLVSRASRISDPADFFGEYIGGLGLSAEHAAVVEGLTEKLFAQ 

AGLLVPEGIASPLELLSIHAGISNHEVAAVLTEVENGTTEYPFMFDAVLRLTPEWAQTLI 

GGVQELIEFATTHRTSWSDRQRESSLPAMIDEIVVAELRERPVGTADRENSVGVALRELR 

PRLILDAERRKVCLRLPEQRVSDDEINWRVSLEGTTRIFSTRRAWGDTSGYSEALDITVE 

RQIRETTVTDTSNQITWVVPVVDFNDPVLVFSARGENLTDKVSLHHQEIYVLAPAEAKLE 

DMVTGQPVPVIEQFLVEGWNSWVCSRVDARGLSSLKVNKEVRCIDPRRRVAFHHPAELVP 

HVRSISGLPVHAQSLIAEFPPTLSGQDETWMLSISAFAGVGAAGEEIAEPEPLEVPADGG 
LFAIFDPEIYDAPWVGEYLVRLRGPRNESFRPEFAIVEDMTTEFEVASGASFRIPTTTGL 
SEASLRVRSGEKHFTAEPRLVTVEATDPNASFVVTTDEGDQMPLRFVPPQIAIELPLTTE 
PPTWRVTRTVCGPRDLDGAGELRIRTGVDVGDPKVSVRNHHGSPLRTVKMVTPDNGRTWI 
ASMKEIAASTFVMPRGSIEFEWTDRKVDRRVSVTIAVIDKTENFTGITIEDGKLVFEELA 
AGRQLAAWVWPQTAPWVSAVELAVTGPELELPEVLVGAGNLIVQLHTADPFTTSVTPLSP 
GKAAVTVEQEGYYSAQTEEYAQLSAFFGGEVEEPPISDAVVPALWDVSHIWTEQGNTEHL 
PVVHAALRSSPAAALKGLSASLVPAQALPGKVISSGLAASPFTTESPATEVHRTAWIGTL 
QLLGALPSAFKEAEELGNRTPLLPILGQLEEVAGKNILSTLATGRDSTLDTACIDQjSTVA 
IAGMNETQQKALLDMFFSNADIVPGPLMEDNTRLMAVFETFKKRDALREVLQTEGLIKTA 
VELLRAMRGTQRQLYSSARIRFDKLDGVNTDNPENMWALTPVVSLVFALSSRLHAHELIG 
KTRTLDRASAGWGRIADLVPDLVTGDLISAEAMVLGARNPGLVD 

>RXN02266 TRANSLATE of: rxn02266.seq check: 6363 from: 1 to: 513 
MTQDEHPRQADSHFNMLLPDGNENAHQLSVALNQVAHLLAYDADSSIHRPDGLSLASYRI 
LFSLWTDGPMSPLQVTDKTGMKKSAISNLLKPLLAESLIVQVTAENDRRSKVLSLSEKGT 
TYIQKTATRQNALESEWFGTLTDIEQDLLESLLRKLLDSNRASKVRKNRSN 

>RXN02270 TRANSLATE of: rxn02270.seq check: 1360 from: 1 to: 621 

MDQARPNRTHYAMVELEQHGFLSGVVTQNVDGLHAEAGTKNLVALHGDLAHVMCLNCGFG 

EDRHLFDERLEAANPGYVASIRLEPGAVNPDGDVFLDEEQVRRFTMIGCLRCGSLMLKPD 

VVYFGEPVPAARKKDLKKLLDASSSLLIAGSSLAVMSGYRIVIEAQRQGKQVSVINGGPG 

RADSRVDILWRTRVAPAFDDILDALDL 

>RXN02362 TRANSLATE of: rxn02362.seq check: 8315 from: 1 to: 3699 

VTISRRLKQERSFADDLQDLKTLNDQLRFTNAKLQARISGIGNDGKKITRPTPLLALDFQ 

LTVEEYETIIAILVEAVGGNQSKPAILKDLFIEYPLVFLAALSGTAMLDAQEGFWPAFWK 

RTQVSVPEHVYDAIRKELVNSIRKNGLETFSLADLNRREYVGLIQLHSGLSAKDMLALVK 

FIDHTRAENQGWDSGEDFASYAKSVFSSGDNLLTTESLKQLVTHIPARSVDFIARVYELT 

NWYRDLKDLNEVEAFVGTHGLPELSFKFLLECLSGEAEQIAEKTKAAPASLENLEPPHLY 

LDPQSFELSLVFPAISKTAALQIPAPEWTVIYDGNSIKVRPEQDWSYGGFAEYRLPLDKP 

LSSLRVITPTEKSLILIEGFGHKNPIMFFKNNGQPYANQEMLSGNAVTAIVPAAAIIRAR 

MRASKTFNYQDLGPLSGWNKWVIRSIPLKRAESITVSHGGFRKELPVRRKVDVQWITEDL 

TIENLQGLDHEPVFHTSPRIEFPTSGSNWVIQYSQILPDGSLIEMEDYPVEPENFGYELD 

LFEESDDPWVGQFLVTLLKDEKVYETRKFNLAEGLDLSLTFSGGGPENRFRYPSINQGQT 

GLTKTFARFSSNSEKHIRFPDEIIGLDAFTSQKAFNIASGDFPEDYNLDVFITPPQLHYQ 

VPVTHSQTKWESTKTTLDFNDFADGNLQIRFPNEVYDPNLKIIKMVAYKKPESSEPKYLS 

KIGSSKVWSIPMDRIKELMDDDAQFLLIAEWFAESKDQHREKIISEAKRTGKISNAALKS 

ARPQPQASSHIATIEKKPLLAAAEIKLSTVELELGRHTSKRLEGWAWSALNPLDPPIKVD 

FQGTSGSLPDTHFVVGPLIVEVREKEFLSQWQPKVPSVKAVVANDPSFELDPQFDPFLTH 

RWMFAPRSGKVLLPQEIRTVWDARFNMRHVLAQRENLHVKSIQDFDDATSTYLTSDPRVA 

LDELDKSSIPSNSHFESFIRSGLAELSFEVDDTAGDIHRVPWIGLIQEMNDLRILQIQGY 

ETEERAIERRNSQSYIREIGGSELWNILKGNSEGLSLAQKCAPQATEINVIRNSGLETUyiR 

NGLGADQFSAEFISADSRLRAQLEWLENRRELNDLGQLPTLFDFAEKYEYLIDHLGDDRI 

KVTARELSTLASEHRRGNAENWLYAPYVSFIYSLLNRMIAHEVIRPIAQINYSRHDWANA 

ARLIPRLTGFDLVSAEAKVLSAINNNNIIPTAI 

> RXA02365 (1-540, translated) 180 residues 

MELGQTKRGQ MGGMDYLSED RVELRYTMPL GEIIFDFFDM LKSRTKGYAS LNYEEAGEQT 
ADLVKVDILL QGEPVDAFSA IVHRDNAQWY GNKMTVKLKE LIPRQQFEVP VQAAIGSKVI 
ARENIRALRK DVLAKCYGGD ISRKRKLLEK QKAGKKFLMKN IGSVEVPQEA FVAALSTDEA 

>RXN02450 TRANSLATE of: rxn02450.seq check: 5093 from: 1 to: 555 
MNLKDLKAAETRQRFIDVAHELFLEHGYGSTSMNQIAQAAGGSRANLYLHFRNKPDLMMA 
KMRELEPAVRTPVLKVFDLPEHTLESILRWLDSMTEVWKANAKVFGAMEQAMVEDAAVAD 
EWLSMMQRLSQSVPELVENEERRVQFLASLMGMDRNFYFLYVRGQDVDEELLKLAVARQW 
LAVFQ 

>RXN02493 TRANSLATE of: rxn02493.seq check: 2568 from: 1 to: 1239 
VSTLLAFVLGVVLMGLALPAYTKIKDRMRRHKSAVTLSENQVTTVGQVLHLAIQGSPTGI 
TVVDRTGDVILSNGRAHELGIVHERSVDGNVWRVAQEAFQDQETHSLDVHPDRNPRRPGS 
RITAVQAVVKPLTLIDDRFVIIYASDESENVRMESARRDFVANVSHELKTPVGGMALLAE 
ALMESSDDPEQVEYFGSRLHREAHRMADMINELISLSKLQGAERLPDMEPVQADDIISEA 
IERTQLAADNANIEIIRGDRTGVWVEADRSLLVTALANLISNAINYSPKSVPVSVSQSIR 
NDVVMIRVTDRGIGIAPEDQGRVFERFFRVDKARSRQTGGTGLGLAIVKHVMANHGGSIS 
LWSRPGTGSTFTLELPVYHPESKEPAGSKQGPSLDSPIRTTASKASGRRKEKS 

>RXN02506 TRANSLATE of: rxn02506.seq check: 9674 from: 1 to: 882 

MHLNQLEFFIAVAQHGQINRAAEELLISQPALSRQISALEKSVGAPLFERHSRGVSLTKA 

GEILHEEALRTLSRMQSVVDEIQSGEHLITSINIGVPPGIPIDWLRCQLIDLGPETRISL 

IESPTDDQLKLLKQRELDIALCRRQSEAFATTLVHEQELGIVVRKNSELHQKVAGKDNAT 

LFDLEGLRVLAHSRGEVRIQEEILKNAMLAAGVNATWIFRKFGQYSSLIADLVQADVALT 

TEESARTNFPSWQWVPIEGEDASGNDLVVRTWITWNPQPTPAVKALIQKFIDGN 



>RXN02553 TRANSLATE of: rxn02553.seq check: 2092 from: 1 to: 564 

MAVKRNELEPELTSNPNPLSAEVHHLYPEETRLATEILERTNNWLAEKGIPPLPPAEVVA 

ISLHLVNAGFRTEDLAETYVMTGVFEQLFEVIDSSFGITLDRQSVNAARFITHMRYFFVR 

VHHDGQLNDGMSVLRNSLEISHPDSVACAERLSQILSLRLGAELSSDEQTYLALHVARLA 

EDRGTTAD 

>RXN02620 TRANSLATE of: rxn02620.seq check: 2192 from: 1 to: 666 

MAGAVGRPRRSAPRRAGKNPREEILDASAELFTRQGFATTSTHQIADAVGIRQASLYYHF 

PSKTEIFLTLLKSTVEPSTVLAEDLSTLDAGPEMRLWAIVASEVRLLLSTKWNVGRLYQL 

PIVGSEEFAEYHSQREALTNVFRDLATEIVGDDPRAELPFHITMSVIEMRRNDGKIPSPL 

SADSLPETAIMLADASLAVLGAPLPADRVEKTLELIKQADAK 

>RXN02758 TRANSLATE of: rxn02758.seq check: 5860 from: 1 to: 1299 

VTELIQNESQEIAELEAGQQVALREGYLPAVITVSGKDRPGVTAAFFRVLSANQVQVLDV 

EQSMFRGFLNLAAFVGIAPERVETVTTGLTDTLKVHGQSVVVELQETVQSSRPRSSHVVV 

VLGDPVDALDISRIGQTLADYDANIDTIRGISDYPVTGLELKVTVPDVSPGGGEAMRKAL 

AALTSELNVDIAIERSGLLRRSKRLVCFDCDSTLITGEVIEMLAAHAGKEAEVAAVTERA 

MRGELDFEESLRERVKALAGLDASVIDEVAAAIELTPGARTTIRTLNRMGYQTAVVSGGF 

IQVLEGLAEELELDYVRANTLEIVDGKLTGNVTGKIVDRAAKAEFLREFAADSGLKMYQT 

VAVGDGANDIDMLSAAGLGVAFNAKPALKEIADTSVNHPFLDEVLHIMGISRDEIDLADQ 

EDGTFHRVPLTNA 

>RXN02910 TRANSLATE of: rxn02910.seq check: 4179 from: 1 to: 705 

VEIRWLEGFIAVAEELHFSNAAIRLGMPQSPLSQLIRRLESELGQKLFDRSTRSVELTAA 

GRAFLPHARGIVASAAVAREAVNAAEGEIVGVVRIGFSGVLNYSTLPLLTSEVHKRLPNV 

ELELVGQKLTREAVSLLRLGALDITLMGLPIEDPEIETRLISLEEFCVVLPKDHRLAGEG 

VVDLVDLAKDGFVTTPEFAGSVFRNSTFQLCAEAGFVPRISQQVNDPYMALLLAR 

>RXN02946 TRANSLATE of: rxn02946.seq check: 8998 from: 1 to: 459 
MTTEAPIWPAELFEDLDRNGPIPLYFQVAQRLEDGIRSGVLPPGARLENEISVAKHLNVS 
RPTVRRAIQEVVDKGLLVRRRGVGTQVVQSHVTRPVELTSFFNDLKNANLDPKTRVLEHR 
SLQQVPPSQKNSEFPQVTKSSSSAASAPPETSP 

>RXN02954 TRANSLATE of: rxn02954.seq check: 5707 from: 1 to: 738 

MSAALPHTAADPVHTTPAKPLLDHVLDSLGRSIISGEMEAGSTFKLQDIGEKFGISRTVA 

REAMRALEQLGLVASSRRIGITVLSHEHWAVFDKAIIRWRLEDERQREQQLQSLTELRIA 

IEPIAARSVALHASSAEIAIIGDLAARMRNLGEAGRGASQEFLDADVKFHELILQYCHNE 

MFAAMAPPIPCAVLVGRTTLGLQPDRPAEEVLDNHDALAHALSVRNADLAEKASRSILNEV 

RDALTS 

>RXN02990 TRANSLATE of: rxn02990.seq check: 3521 from: 1 to: 597 
MDIQAEKIEKLRKALDNFERAHARGESDFFDHEKEEKKANVRRRALLLLNQRARSVNELS 
TRLKALEFEEDIINEVIGDLTRSKLLDDEVFATEWVRQRAARRGKSSRALDRELQEKGVD 
KQTRAAALEQIDQADERDTARAVAVKKARSETKIPQDRADYDKALRRVVGALARRGFPAG 
MSMDLAREALDARIEDLKN 

>RXN03023 TRANSLATE of: rxn03023.seq check: 7800 from: 1 to: 357 
VPLYKQIASLIEDSIVDGTLSIDQRVPSTNELAAFHRINPATARNGLTLLVEAGILYKKR 
GIGMFVSAQAPALIRERRDAAFAATYVAPLIDESIHLGFTRARIHALLDQVAESRGLYK 

>RXN03071 TRANSLATE of: rxn03071.seq check: 1804 from: 1 to: 
MLAPWQLHKDDDIVARNEQITEAFERDVVPYAELFDASGQIPSSQEFFRVSLTGQYLPDS 
EVLLRLRPVDSGPAFQSLTPFELENGQIVLVNRGYESSEGTIVPEIEPAPSHQ 

>RXN03072 TRANSLATE of: rxn03072.seq check: 9857 from: 1 to: 
MEDSGYTQVYGINTEQISDVTGLDLGTDYVQVAEGEPGVLNPMPLPQMDRGNHLSYGFQW 
IAFGIMAPLGLGYFIWAEMRERRRDKAEREQMAELNTLEPVVETPEVVETAEPTITPAAS 
KRRSRYGDQHRNHYEKISKRDQERF 

>RXN03090 TRANSLATE of: rxn03090.seq check: 1334 from: 1 to: 
MAKSTPLIASLRWRIVLWMTAVVFLTLASVVIITRSVLLSEVTNTANSAVEQEIEEFRRF 
AAEGIDPTTAQPFESGHRLMEVYLSRQIPDENEAIVGIFPGELIQVDYSQLSGAHPLPLE 
HSDPLISEIRQTTLNSGVFSDLERGTTHWGKVNFQTASGEADGEFVVAFFADNLKDQVNG 
QIQILILIGTGGLIASILIAWLIAGQIIAPIRKLSSVSAKISNSDLTWRVPVEGRDEIAQ 
LARTFNAMLDRIEIAYNDQRQFVDDAGHELRTPITVVRGQLELLATTPPEEQARSIELAT 
TELDRMSRMVNDLLTLAVADSGTFIHAHPTDVTDLTIDIEDPCARTISDRILLVDARPRAS
SASTSSGSPRQCLELFGNALRYSDDVVELGSGFQGVWPPPHFSHLGS 

>RXN03100 TRANSLATE of: rxn03100.seq check: 173 from: 1 to: 318 

LYGQDKVTSDPMEAAYTSLYLWKEMVEKADSFDVAAIQAAADGTTFDAPEGTVVVGGDNH 

HISKTPRIGRIRPDGLIDTIWETDSPVDPDPYLSSYDWAKTTAATS 

>RXN03127 TRANSLATE of: rxn03127.seq check: 8598 from: 1 to: 720

MESSKKTSRSRSTTQEAVRDIKKYIRDNRLRTGDLLPSEAFLCEELGCSRSAIREAIRAL 

VTLDIVEVRHGYGTFVSRMSLEPLINGMVFRTVLDNDTSVENLFYVVDTREILDLSLGEE 

LIEVFTDDDRELLLDLVDKMREHNDQGESFVVEDQKFHRALLARTKNPLIRELNDAFWQI 

QTEAQPMLNLAMPADIDETIKAHSDIVEALSSGNIDDYRSAVLAHYAPFRRMISNMLDAH 

>RXN03136 TRANSLATE of: rxn03136.seq check: 9294 from: 1 to: 2415 

LGAHSANSIRGVIDRLDASTVVIVADVHWADVESMQKLIEYSMRMVSGRFALIMIGLDEE 

NLVFHDEVVSLPSIADSTYVLPPMSIEEIRQLALTDVRGRISTTTATDIQRITGGIYGRV 

KEVLHSESPDHWRMPNPNIPIPQSWHANLLRRITNEEVWHVLLAVAVLPSGGPIDLVKLI 

GNDPTGMLCDDAVRSGLLRVLPSDGQPQVDLVLPIDRAVLQSRTPLNILAQLHHKAAEYY 

GKWNQKDAQLEHEAFAAIDPNDPAVRALAQRGYALGRTGHWMESAHALSLAANRTAHQEE 

SNKYLLESIDSLIAAADLPQARSRASTLDLGETGIQQDSMLGYLAIHEGRRLEARNLLHR 

ASEELLAQHPIDPIHGPRMAQRKVLLNLVDWNPEELLVWADRAVAWTEEDAGEKVEAQAI 

SLIGQSILDGCLPEDKPIPGETTLHAQRRHiyiAMGWLSMVHDDPVTARQKLERRTSINGSE 

RISLWQDGWLARSLLLLGEWESAARTVEIGLARAEQFGIRFLEPLLLWSGATIATARGNS 

DLARNYMSRLSTDQDSFIVQSMPSAMCRMWVHRHRNEIPGAIVAGEQLEKIAAHKHVNAP 

GFWPWQDVHATHLIRIGETERAQELVNSTLEELRGSDIMSAHAKIAVPDAMLMIHHGDVK 

KGFKRFDDALDMIDPLTLPYYRARICFEYGQALRRQGQRRRADEQFARAASLFQDMGADA 

MVTLANRERRVGGLGQRSEQAGGLTPQEYEIARLVSSGHANREVAQELFLSPKTVEYHLT 

RVYKKLGIRNRMELAEALKKYSHDA

>RXN03143 TRANSLATE of: rxn03143.seq check: 755 from: 1 to: 1131 

VKTSQATIARIERVLIWGLHLLIAVLLVLVCWRASHWGVWVLAFGYGVVYVAGVVPNSPF 

KNHPMAWFLVLSLLWASLIWDGPEPAYLVFPMFFLAVLITTPLKSAIIIAILTAIAVVTL

AMHLGFSVGVVTGPILGALVAWVMGTCFQLLAQALKELVDAEIASAIRASKSAGEQAEEIAR 

IAGEIHDTVAQGLSSIQMLLHAAEKRVDDPQALSHIRLARQTTADNLAETRQIIAALQPT

PLIGADLPVALARLSSTTPMGQNITFEVDGSPRVLPDAMEAEIVRIAQTLLGNVVRHAQA 

DSAKMTLTYQDDQILLDVIDNGQGFDVAEVIRKKSIGLPTAQRRAEGLGGTIIIESTIGS 

GTGISARFPYPQKDQDK 

>RXN03155 TRANSLATE of: rxn03155.seq check: 4295 from: 1 to: 1668

GYPPPPTASKDAAGGLPQLIRELLDATPIDHWSNDRPTLTLPEHWVTDIDIKNPVLREVA 

SHPFFDGCPIGDLDADAFVEDGTLIHENGTLRFRSPEERTLVRASTPPSMARSPREWEST 

EGGVDKLIAAGNLPLARLHVEELPRADEQRAFLALYGGQSFEAASASPFYALATWNPEAL 

RGDPTFDMFADALDTGHYREVPRPDAPEESQIHDFISGWLALVYDDPLTARRLLSSRGPS 

DLVGLWQSAFLARAHYVLGEFQEASAVVERGLATGDRTGASLLEPVHLWTGAQVAAMTGR 




TELANHYLQRLTVPDDAFLIQKLSASMGKLITASMTSDTRAATLAGDRMASVVYTTNTQQ
PGFWAWEDMYAISLIRTGRIDAAAAVMDGIPDSTIPSLRARNLVPQANIEIQRGSTARGV 
KMLSEAVDLISSVNMPAYEARILFEYGLVLRRMGRRSQAAEMFTHAEEVFTAMGAVTLAA 
RCHGERRVAGVGPRRSAQGLTPQEEQITALVVDGCSNQEVARELSLSAKTVEYHLTRVYK 
KLGVSSRGELRELLKV 

>RXN03181 TRANSLATE of: rxn03181.seq check: 5678 from: 1 to: 414 
VETQAFQRQNTGLIAMVAADASNPFFLEIFRGAQHAASTQGYTVALVDARESAIKSREVL
DKIVPHADGLLLAASRMDSGEIHKVAREIPTVLMSREVQGIPSVMVDNYDGAPKAVVHLV 
DQGCRSITYIAGPNKSWA 

>RXS00070 TRANSLATE of: RXS00070.seq check: 7466 from: 1 to: 432 
VGINRISQGSAPKLGVRSTRQRFCAVIDVLEEIDNFASAKEIHHELSTREHNVGLTTVYRTLQSLADIGA 
VDVLTVTGGETLYRQCHDEGHHHHLVCTNCGRTVEIDGGPVETWAQEIATKNGFALSSHEAEIFGLCAD 
CKEKVT 

>RXS00133 TRANSLATE of: RXS00133.seq check: 4189 from: 1 to: 813 

MLFVRRLTSLKTATGIPVTMFATVLQDNRLQITQWVGLRTPALQNLVIEPGVGVGGRVVATRRPVGVSD 

YTRANVISHEKDSAIQDEGLHSIVAVPVIVHREIRGVLYVGVHSAVRLGDTVIEEVTMTARTLEQNLAI 

NSALRRNGVPDGRGSLKANRVMNGAEWEQVRSTHSKLRMLANRVTDEDLRRDLEELCDQMVTPVRIKQT 

TKLSARELDVLACVALGHTNVEAAEEMGIGAETVKSYLRSVMRKLGAHTRYEAVNAARRIGALP 

>RXS00144 TRANSLATE of: RXS00144.seq check: 1916 from: 1 to: 576
MSERNSAVLELLNEDDVSRTIARIAHQIIEKTALDSKDADRVMLLGIPSGGVPLARRLAEKIEEFSGVS 
VDTGAVDITLYRDDLRNKPHRALQPTSIPAGGIDNTTVILVDDVLFSGRTIRAALDALRDVGRPNYIQL 
AVLVDRGHRQLPIRADYVGKNLPTARAEDVSVMLTEIDGRDAVTLTREDSEGDS 

>RXS00205 TRANSLATE of: RXS00205.seq check: 3895 from: 1 to: 1107

MASETSSPKKRATTLKDIAQATQLSVSTVSRALANNASIPESTRIRVVEAAQKLNYRPNAQARALRKSR 

TDTIGVIIPNIENPYFSSLAASIQKAAREAGVSTILSNSEENPELLGQTLAIMDDQRLDGIIVVPHIQS 

EEQVTDLVNRGVPVVLADRSFVNSSIPSVTSDPVPGMTEAVDLLLAADVQLGYLAGPQDTSTGQLRLNT 

FERLCVDRGIVGASVYYGGYRQESGYDGIKVLIKQGANAIIAGDSMMTIGALLALHEMNLKIGEDVQLI 

GFDNNPIFRLQNPPLSIIDQHVQEIGKRAFEILQKLINGDTAQKSVVIPTQLSINGSTAVSQKAAAKAA 

KAAQECAAAKAAQNTQHEVSLDGEL 

>RXS00470 TRANSLATE of: RXS00470.seq check: 9539 from: 1 to: 1269

MGESPEKVAFRVFPDGLVSQGHDMIEDMSNTPAPYTPQPAGQAVPLYPTFTRSRDGRVVAGVASGLAKH 

LNVSVFWVRALLIFAALLSGAGLFAYALIWIFTRIEKKGSGEASTSKRWVSWCLVLLAIGGAAASVMLS 

TGFAVGTLVPIGVVGVGLLMVWLAYDRGVESGPNLLIIATGGVLMLVAIVLIVMNWNTQDGFVMALVAV 

VLTLVGVAALGVPLWVRMWDQLGEERAEKAAAAERADIASRLHDSVLQTLALIQKRADDPAEVARLARG 

QERELRQWLFDSQDKTPQTTGTVFTALERACGEVEDIYALRIVPVTVGTDEALTEKTQAAVMAVREALV 

NVAKHAGVETADVYAEIMLGELNIFVRDRGAGFDPDNIPDGHHGLAESVQGRVERAGGKVRIKSEIGEG 

TEVAITMDV 

>RXS00471 TRANSLATE of: RXS00471.seq check: 3433 from: 1 to: 690

MVDVFLVDDHSVFRSGVKAELGNAVTVVGEAGTVADAVAGIKASKPEVVLLDVHMPDGGGLAVLQQIND 

SDVDTIFLALSVSDAAEDVIAIIRGGARGYVTKSISGEELIEAINRVKSGDAFFSPRLAGFVLDAFAAP 

DSAAGAGIVDAPEKDAAVESGKILDDPVVDALTRRELEVLRLLARGYTYKEIGKELFISVKTVETHASN 

ILRKTQQSNRHALTRWAHSRDLD 

>RXS00481 TRANSLATE of: RXS00481.seq check: 4415 from: 1 to: 585 
MLNMQEPDKIHPAEPTLRNIYDVKTSDPKSELVDRSGMSEEDIAQTGRLMKSLASLRDVERSIGEASAR 
YMELSAPDMRALHYLIVAGNAGEVVTPGMLGAHLKLSPASVTKTLNRLEKGGHIVRNVHPVDRRAFALM 
VTDATRGEAMRTLGKHQARRFDAAKRLTPQEREVVIRFLQDMAQELSLNNAPWLNTE 

>RXS00649 TRANSLATE of: RXS00649.seq check: 7418 from: 1 to: 456
MSTDPIAALEYESTIFARHRNQYTGQAGTNAGVLDSSGYNLLTLLQLRGPSTIGELSAITGLDASTLNR 
QTKALLTKGFVERIPDPDGGIARKFHPTDLGNELLNEERTSSQEKYAELLSDWPEEDLRTFVKLLEKLN 
KAVETRVGKHWPRP 

>RXS00650 TRANSLATE of: RXS00650.seq check: 1698 from: 1 to: 636 
MIRVLLADDHEIVRLGLRAVLESAEDIEVVGEVSTAEGAVQAAQEGGIDVILMDLRFGPGVQGTQVSTG
ADATAAIKRNIDNPPKVLVVTNYDTDTDILGAIEAGALGYLLKDAPPSELLAAVRSAAEGDSTLSPMVA 
NRLMTRVRTPKTSLTPRELEVLKLVAGGSSNRDIGRILFLSEATVKSHLVHIYDKLGVRSRTSAVAAAR 
EQGLL 

>RXS00657 TRANSLATE of: RXS00657.seq check: 8495 from: 1 to: 903 

MSTEDIVVVAVDGSDASKQAVRWAANTANKRGIPLRLASSYTMPQFLYAEGMVPPQELFDDLQAEALEK 

INEARDIAHEVAPEIKIGHTIAEGSPIDMLLEMSPDATMIVMGSRGLGGLSGMVMGSVSGAVVSHAKCP 

VVVVREDSAVNEDSKYGPVVVGVDGSEVSQQATEYAFAEAEARGAELVAVHTWMDMQVQASLAGLAAAQ 

QQWDEVERQQTDMLIERLAPLVEKYPSVTVKKIITRDRPVRALAEASENAQLLVVGSHGRGGFKGMLLG 

STSRALLQSAPCPMMVVRPPEKIKK 

>RXS00686 TRANSLATE of: RXS00686.seq check: 3127 from: 1 to: 804 

MAGGNREPGRTVTSKVIAVLGAFEHTMRPLGVTEIAELADLPPSTTHRLVSELTEGGLLSKKSDGRYQL 

GLRIWELAQNTGRQLRDTARPFIQELYSLTSETAQLVVRDKDEALLIDRAYGTKKIPRSARVGGRLPLN 

STAVGKILLAFDEPWVKQSYLKLPLNASTPKTIVNPDVLAAQLKQIHSQGFAITHDEQRIGGASIAVPV 

WHTGKLGAALGLVVPTAQAANLERYLPILQATSQRITKATALIPLDTLLASHKNAERKGDT 

>RXS00719 TRANSLATE of: RXS00719.seq check: 7090 from: 1 to: 1629 

VTDKHTMPGEEDDTVFVYHTHKGEMDVEGAFADEEELAPHGGWASADFDPAEFGYEDSDDDFDAEDFDE 

TEFSNPDFGEDYSDEDWEEIETAFGFDPSHLEEALCTVAIVGRPNVGKSTLVNRFIGRREAVVEDFPGV 

TRDRISYISDWGGHRFWVQDTGGWDPNVKGIHASIAQQAEVAMSTADVIVFVVDTKVGITETDSVMAAK 

LLRSEVPVILVANKFDSDSQWADMAEFYSLGLGDPYPVSAQHGRGGADVLDKVLELFPEEPRSKSIVEG 

PRRVALVGKPNVGKSSLLNKFAGETRSVVDNVAGTTVDPVDSLIQLDQKLWKFVDTAGLRKKVKTASGH 

EYYASLRTHGAIDAAELCVLLIDSSEPITEQDQRVLAMITDAGKALVIAFNKWDLMDEDRRIDLDRELD 

LQLAHVPWAKRINISAKTGRALQRLEPAMLEALDNWDRRISTGQLNTWLREAIAANPPPMRGGRLPRVL 

FATQASTQPPVIVLFTTGFLEAGYRRYLERKFRERFGFEGTPVRIAVRVRERRGKGGNKQ 

>RXS00738 TRANSLATE of: RXS00738.seq check: 6764 from: 1 to: 363

CQEETDGFFDFGRDMRPGERRSYGTLLNDATTQVSHILGNAFTRSGLNAEYANLYGQALVGMVSMTAQW 

WLDERTPPKEEVAAHIVNLCWNGLTGMEADPKLTPISSAEGAIFGQEKESEA 

>RXS00774 TRANSLATE of: RXS00774.seq check: 6151 from: 1 to: 654

MDKATDALLRTSLASAESALGNAEKLEELRTGCESQAVELLALETPVARDLRQVVSSIYIVEEITRMGA 

LAMHVANSVRRRYPDPVIPEDMRGYFKEMARLAADMTDHIRQILIDPEPDLALEMAKSDDAVDDLHQHI 

MRILTLRPWPHDTKSAVDLTLLSRFYERYADHTVNVAARIIYLSTGLHPEEYMEKREQQRADADMEKRW 

AELERQFRTSE 

>RXS01082 TRANSLATE of: RXS01082.seq check: 2555 from: 1 to: 660 

LTQWGNSNVVEDYLTALFEIAEEWDEEPTTGKLAEVIGVTASTVSATLKKLNPEGFVNYRPYGDIELTPA 

GRDIAINVIRRRRIIETYLSEKLGLGAHELHGEADLLEHAVSPLVLEKMFQAVGYPTLDPHGDPIPTES 

GEMTINDGLMLLGLKAGASATVTRVRDGNPSVVRYLTGVGITVGTTVTVVEALSDIATLRLQIGEMFQD 

IPLAVANAVRVSR 

>RXS01123 TRANSLATE of: RXS01123.seq check: 5460 from: 1 to: 447 
MRTLAAELNIKAPSLYKHVKTREDIAAHIATKAFIQLGQSLHEHCESVEDLLAEYRSMARENPNIYRLL 
TSSEFPRELLPEGLETWAGTPFYLVTGHDPIKGQALWAFAHGMAILEIDARFAGPNNGSPADGVWEIGA 
RAFDTQVFDQG 

>RXS01189 TRANSLATE of: RXS01189.seq check: 2136 from: 1 to: 609 
MISISIADDEALIASSLATLLSLEPDLDVRPTAGSGEELIETWADPSNRTDVCVLDLQLGGIDGIDTAT 
RLMETTPDLAVLIVTSHARPRQLKRALAAGVLGFLPKTSTADEFATAIRTVHAGRRYIDPELAAMTISA 
GESPLTNREEEVLELAGQGLSAEEIAVAAHLAPGTTRNYLSQAMTKVGAQNRFEAFTRARELGWL 

>RXS01242 TRANSLATE of: RXS01242.seq check: 954 from: 1 to: 777 
MYAEERRRQIASLTAVEGRVNVTELAGRFDVTAETIRRDLAVLDREGIVHRVHGGAVATQSFQTTELSL 
DTRFRSASSAKYSIAKAAMQFLPAEHGGLFLDAGTTVTALADLISEHPSSKQWSIVTNCLPIALNLANA 
GLDDVQLLGGSVRAITQAVVGDTALRTLALMRADVVFIGTNALTLDHGLSTADSQEAAMKSAMITNAHK 
VVVLCDSTKMGTDYLVSFGAISDIDVVVTDAGAPASFVEQLRERDVEVVIAE

>RXS01607 TRANSLATE of: RXS01607.seq check: 90 from: 1 to: 630 
VIRILLADDHPVVRAGLASLLVSEDDFEIVDMVGTPDDAVARAAEGGVDVVLMDLRFGDQPGIEVAGGV
EATRRIRALDNPPQVLVVTNYSTDGDVVGAVSAGAVGYLLKDSSPEDLIAGVRDAARGESVLSKQVASK 
IMGRMNNPMTALSAREIEVLSLVAQGQSNREIGKKLFLTEATVKSHMGHVFNKLDVTSRTAAVAEARQR 
Gil 

>RXS01674 TRANSLATE of: RXS01674.seq check: 1368 from: 1 to: 894 

MDNGWPNLQTLALFVAIVEEGSLGAGARKVGMAQPNASRAIAELEADMKAELLVRHPRGSHPTAAGLAL 

VEHSRDLLQSVQEFTEWVTEGRTEQPLKLHVGASMTIAEALLPAWVADMRTRFPACRVDVSVMNSSQVI 

EAVQKGHLQLGFIETPHVPVRLHARVVQEDKLIVVISPNHEWANRTGRISLRELSETPLIVREVGSGTR 

EALQELLADYDMAEPIQVLNSNAAVRVVVEAGAGPAVLGELALRDHLALGRLLSVPFEGSGVTRPLTAV 

WSGPRRLPILAGELVSIASNHI 

>RXS01872 TRANSLATE of: RXS01872.seq check: 8549 from: 1 to: 828 

MGNDGGDLRIDDLRSFISVAQSGHLTETAQRLGIPQPTLSRRISRVEKHAGTPLFDRAGRKLVLNQRGH 

AFLNHASAIVAEFNSAATEIKRLMDPEKGTIRLDFMHSLGTWMVPELIRTFRAEHPNVEFQLHQAAAML 

LVDRVLADETDLALVGPKPAEVGTSLGWAPLLRQRLALAVPADHRLASFSGQGELPLITAAEEPFVAMR 

AGFGTRLLMDALAEEAGFVPNVVFESMELTTVAGLVSAGLGVGVVPMDDPYLSTVGIVQRPLSPPAYRE 

>RXS02117 TRANSLATE of: RXS02117.seq check: 9965 from: 1 to: 474 
VSTDPEEFDQAETLDQLAYEIILLTRYGVQNTPTNKREAIMDRSALILLTRLDAQGPMTVNELAESFGL 
NVSTVHRQLKAAIANGLIEVVDDQACPAKLHRPTELGKEKLQQELLARQQDLTRILHDWDEEDIKTHAK 
LLRKHNESLEEYLDMKWPRP 

>RXS02288 TRANSLATE of: RXS02288.seq check: 9420 from: 1 to: 846 

MSQVIPASSQEKRRERIVSYVTRHGFARVEALAELFEVSAMTIHRDLEALAADNLVERIRGGARSVSPS 

MSELAVEQRRHLHRTVKEALCTAAARLIPEGAVVAIDDSTTLESLVEKLPQRSPSALITHSLKTMADHR 

VRAGMSDIRLIACAGLYFAETDSFLGKATSAQLNELSADISFVSTTAVRATGEVPALFHPDMEAADTKR 

ALIGIGSVRVLVVDSSKFGSAGVFKVASIEEFDHIIIDQQCTREQRDLLRNSRAQIHVIDHNGDEILDT 

PTEEDF 

>RXS02573 TRANSLATE of: RXS02573.seq check: 9274 from: 1 to: 444 
MTNKTMLVAFDGSPESRRALEYAAKLLQPRTVEILTAWEPLHRQAARSVSLITLGVEPEDPAHSAALKT 
CQEGVELAQSLGLEARAHMVESATAVWSAIVDAADELRPDVIVTGTRGISGWKSLWQSSTSDSVLHHAD 
VPVFVVPPLD 

>RXS02627 TRANSLATE of: RXS02627.seq check: 3594 from: 1 to: 843 

DVTVESQPERVVALGWGDAEAALEFGVQPVGASDWLAFGGEGVGPWIEDSAYDEAPEIIGTMEPEYEKI 

AALEPDLILDVRSSGDQERYDKLSSIALTIGVPEGGDSYLTPRAEQVTMIATALGQAERGEEVNAEYEQ 

LTADIRAAHPGWPEKTAAAVSATATSWGAYIKGSNRVDTLLDLGFQENPELAKQQPGDTGFSIKFSEET 

FGVVDSDLVVGFAIGMTPEEMAEQVPWQMLTATRDGRSFVMPREISNAFSLGSPQSTRFALDALVPLLE 

EHAGE 

>RXS02691 TRANSLATE of: RXS02691.seq check: 1824 from: 1 to: 807 

MNTMPDQPLNQDGFPTASKGVEPDNLPDRVLVDGLKPKHQQLREILEEICTTQLQPGDMLPGERILEEK 

YGVSRITVRRAIGDLVASGRLKRARGKGTFVAHSPLISRLHLASFSAEMAAQKLSATSRILSSSRGPAP 

DDIADFFGTDRAAQHITLRRLRFGNGRPYAIDNGWYNSEFAPDLLENDVYNSVYSILDRVYGVPVTQAE 

QTVTAVAADEDTARLLDVTPGAPLLRILRQSLSGDKPVEWCVSLYRTDRYSLKTLVTRSEDL 

>RXS02730 TRANSLATE of: RXS02730.seq check: 6607 from: 1 to: 1038 

MATEKFRPTLKDVARQAGVSIATASRALADNPAVAASTRERIQQLASDLGYRANAQARALRSSRSNTIG 

VIVPSLINHYFAAMVTEIQSTASKAGLATIITNSNEDATTMSGSLEFLTSHGVDGIICVPNEECANQLE 

DLQKQGMPVVLVDRELPGDSTIPTATSNPQPGIAAAVELLAHNNALPIGYLSGPMDTSTGRERLEDFKA 

ACANSKIGEQLVFLGGYEQSVGFEGATKLLDQGAKTLFAGDSMMTIGVIEACHKAGLVIGKDVSVIGFD 

THPLFALQPHPLTVIDQNVEQLAQRAVSILTELIAGTVPSVTKTTIPTALIHRESIINSTLRKKDGLPN 

E 

>RXS02818 TRANSLATE of: RXS02818.seq check: 4037 from: 1 to: 606 
SYSRKFLTQVWIRDNVGDYKGLTDTAFRKKLQRDLAYLRRVGVPIEQFTVTSGIAEGQQAYRLAQDSYK 
LPEVEFTPDEAAVLGMAGEMGHNQELGAFARSGWTKLAAGGAQRDLSTSTALTNAGDLGSLSAKTLDAI 
IKARQLGKQISFEYRRAPKDAPSLRHMDPWGLVPERDRIYLVGFDLDRQEARTFRITRVRNIKL 

MRTSKKEMILRTAIDYIGEYSLETLSYDSLAEATGLSKSGLIYHFPSRHALLLGMHELLADDWDKELRD
ITRDPEDPLERLRAVVVTLAENVSRPELVLLMDAPSHPGFLNAWRTVNHQWIPDTDDLENDAHKRAVYS 
GAARSRWPLRARLHS 

>RXS03066 TRANSLATE of: RXS03066.seq check: 1968 from: 1 to: 663 

MTSDKDTEQLEAAGTEILMPRRRPAQQRSRERFNRILTAARSVLVDLGFESFTFDEVAKRAEVPIGTLY 

QFFANKYVLICELDRVDTAEAVAELKKFSDQVPALQWPDILDEFIEHLARLWRDDPSRRAVWHAIQSTP 

ATRATAAATEKEMLEIIAEVMRPLARGAGYEERMSLAGLLVHTVSSLLNYAVRDVNSSEEDFDSIVEEI 

KRMLISYLFSVATG 



>RXS03200 TRANSLATE of: RXS03200.seq check: 1368 from: 1 to: 894 

EKLLPFAKSTLDAAESFLSHAKGANGSLTGPLTVGIIPTAAPYILPSMLSIVDEEYPDLEPHIVEDQTK 

HLLALLRDGAIDVAM^4ALPSEAPGMKEIPLYDEDFIVVTASDHPFAGRQDLELSALEDLDLLLLDDGHC 

LHDQIVDLCRRGDINPISSTTAVTRASSLTTVMQLVVAGLGSTLVPISAIPWECTRPGLATANFNSDVT 

ANRRIGLVYRSSSSRAEEFEQFALILQRAFQEAVALAASTGITLK 

>RXS03208 TRANSLATE of: RXS03208.seq check: from: 1 to: 262 
VKDLVDTTEMYLRTIYELEEEGIVPLRARIAERLEQSGPTVSQTVARMERDGLVHVSPDRSLEMTPEGR 
SLAIAVMRKHRLAERLLTDIIGLDIHKVHDEACRWEHVMSDEVERRLVEVLDDVHRSPFGNPIPGLGEI 
GLDQADEPDSGVRAID

>RXS03219 TRANSLATE of: RXS03219.seq check: from: 1 to: 978

VKLTDAAREAGVGYGTASRAISGRGSVDAATRDKVLAAAEKLGYRTNAMARALRENKTRTVGLIVPGII 

NFYTESATVLQDELDKSGYQLVVSTTGNDAEKERRAIESMLNRQVDAVVHAPVNPQAKFPKGFKVVELN 

RRSDLNRPTVTSDDATGLKELALHILDQGYRDIGIIVGPAELSTARDRKAGFINALETEATQRGIREEL 

RFRVVHSRYSPTGGYEAFAEFRNDLPQIVVPLSTQLTLGVLKATQNGIKISDDLSLACYGVAEWLAVWG 

PGITVFAPDLPAMGAAAATQVLTLLDAAPLPENHLSIPGQLIVRGTTPKV 